R Programming for Beginners: A Code Collection
A Collection of R Code (Beginner)
R Packages
- Install packages with commands like install.packages() and install_github().
- Use the library() function to load them.
- dplyr: a grammar of data manipulation
- ggplot2: for data visualization
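As a minimal sketch of the install-then-load workflow described above: the install lines are commented out because they only need to run once, and the repository name "user/repo" is a hypothetical placeholder, not a real package.

```r
# Install once per machine (commented out so the script can be re-run safely).
# install.packages("dplyr")
# install.packages("ggplot2")

# install_github() is provided by the devtools (or remotes) package;
# "user/repo" is a hypothetical placeholder for a GitHub repository.
# devtools::install_github("user/repo")

# Load the packages at the start of every new R session.
library(dplyr)
library(ggplot2)
```

Installing happens once, while library() has to be called again in each session that uses the package.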
Basic
- Load data: data() for datasets bundled with R or a package, or load() for saved data files
- Dimensions of data: dim(data_file_name)
- Names of variables: names(data_file_name)
- Single column of data: data_file_name$variable_name
- Data structure: str(data_file_name)
- Create new data from old: new_data_file_name <- data_file_name
- Descending order: desc()
- If else: ifelse(test, yes, no)
- Counting streak lengths: calc_streak(data_file_name)
- Table: table(data)
- Search the names for a fragment of the name: grep("search_name", names(data_file_name), value = TRUE)
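The basic commands above can be tried on a real dataset. The sketch below assumes the built-in mtcars data frame as a stand-in for data_file_name; the 20 mpg cutoff and the names cars_copy and efficiency are illustrative choices. calc_streak() is left out because it is not part of base R.

```r
# Load a built-in dataset and inspect it.
data(mtcars)
dim(mtcars)        # number of rows and columns: 32 11
names(mtcars)      # variable names
str(mtcars)        # structure: type and first values of each column

# Pull out a single column with $.
mpg <- mtcars$mpg

# Create a new data frame from the old one.
cars_copy <- mtcars

# ifelse(test, yes, no) is vectorised: one result per row.
efficiency <- ifelse(mtcars$mpg > 20, "high", "low")

# Count how many rows fall into each category.
table(efficiency)

# Find variable names that contain the fragment "ar".
matched <- grep("ar", names(mtcars), value = TRUE)
matched
```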
dplyr
- group_by(): groups the data by a variable (for example, by origin).
  data_file_name %>% group_by(variable_name) %>% ...
- mutate(): adds new variables that are functions of existing variables.
  data_file_name <- data_file_name %>% mutate(new_variable = variable_A simple_mathematical_operator variable_B)
- select(): picks variables based on their names.
- filter(): picks cases based on their values.
  new_data_file_name <- data_file_name %>% filter(variable_name logical_operator filter_condition)
  - Find the rows where a variable is NA: filter(is.na(variable_name))
- summarise(): reduces multiple values down to a single summary.
  data_file_name %>% summarise(mean_variable = mean(variable_name), sd_variable = sd(variable_name), n = n())
- arrange(): changes the ordering of the rows.
  data_file_name %>% summarise(median_variable = median(variable_name)) %>% arrange(desc(median_variable))
- desc(): sorts in descending order.
- distinct(): selects distinct/unique rows.
- sample(): simulation; a base R function that uses random numbers to generate an outcome.
  data_outcomes <- c("variable_outcome_A", "variable_outcome_B")
  sample(data_outcomes, size = 100, replace = TRUE, prob = c(variable_A_prob, variable_B_prob))
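To see the verbs above working together, here is a small sketch that assumes the built-in mtcars data in place of data_file_name. The variable names (power_to_weight, by_cyl, flips), the 3.5 weight cutoff, and the 0.2/0.8 probabilities are illustrative assumptions only.

```r
library(dplyr)

# mutate(): add a variable computed from two existing ones.
cars <- mtcars %>% mutate(power_to_weight = hp / wt)

# filter() keeps rows matching a condition; select() keeps named columns.
heavy <- cars %>%
  filter(wt > 3.5) %>%
  select(mpg, wt, power_to_weight)

# filter(is.na()) finds rows with a missing value (none in mtcars).
missing_mpg <- cars %>% filter(is.na(mpg))

# group_by() + summarise() gives one summary row per group;
# arrange(desc()) sorts the groups from highest to lowest median.
by_cyl <- cars %>%
  group_by(cyl) %>%
  summarise(median_mpg = median(mpg), sd_mpg = sd(mpg), n = n()) %>%
  arrange(desc(median_mpg))

# distinct(): the unique values of a column.
cyl_values <- cars %>% distinct(cyl)

# sample(): simulate 100 draws from two outcomes with unequal probabilities.
set.seed(1)  # makes the random draws reproducible
coin_outcomes <- c("heads", "tails")
flips <- sample(coin_outcomes, size = 100, replace = TRUE, prob = c(0.2, 0.8))
table(flips)
```

Grouping before summarise() is what makes the arrange(desc(...)) step meaningful: without group_by() the summary has only one row to sort.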
ggplot2
- Making graphics: ggplot(data = data_file_name, aes(x = x_aes_name, y = y_aes_name)) + geom_line() + geom_point()
- Making side-by-side box plots: geom_boxplot()
- Making segmented bar plot: ggplot(data = data_file_name, aes(x = x_aes_name, fill = variable_name)) + geom_bar()
- Making histogram: ggplot(data = data_file_name, aes(x = x_aes_name)) + geom_histogram(binwidth = number)
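The four plot templates above might look like this with concrete data. This sketch assumes the economics dataset that ships with ggplot2 and the built-in mtcars data; the chosen variables and the bin width of 2 are illustrative, not prescribed by the templates.

```r
library(ggplot2)

# Line plot with points: unemployment over time.
p_line <- ggplot(data = economics, aes(x = date, y = unemploy)) +
  geom_line() +
  geom_point()

# Side-by-side box plots: one box of mpg per cylinder count.
# factor() turns the numeric cyl column into a categorical variable.
p_box <- ggplot(data = mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot()

# Segmented bar plot: bars by cylinder count, segments filled by gear.
p_bar <- ggplot(data = mtcars, aes(x = factor(cyl), fill = factor(gear))) +
  geom_bar()

# Histogram of mpg with bins 2 units wide.
p_hist <- ggplot(data = mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 2)
```

Each plot is stored as an object here; typing its name at the console (or calling print() on it) draws it.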


