Sometimes, you may want to create a dataset manually in R. This approach is useful for:
Testing small examples.
Learning the structure of data in R.
Quickly experimenting with specific cases.
While manually typing data is rarely practical for larger datasets, it can help you understand how data is organized and manipulated in R.
11.1 Creating a Data Frame
In R, data is commonly stored in data frames or tibbles. You can create a small data frame manually using the tibble() function from the tidyverse.
Here’s an example:
# Manually creating a small dataset
library(tidyverse)  # provides tibble()

data <- tibble(
  name = c("John", "Jane", "Alice", "Bob"),
  age = c(25, 30, 28, 22),
  occupation = c("Engineer", "Doctor", "Artist", "Student")
)
data
# A tibble: 4 × 3
  name    age occupation
  <chr> <dbl> <chr>
1 John     25 Engineer
2 Jane     30 Doctor
3 Alice    28 Artist
4 Bob      22 Student
Note
When entering data:
Use double quotes (") for string values (e.g., "John").
Do not use quotes for numeric values (e.g., 25).
End each argument with a comma (,), except for the last argument.
This creates a tibble with three variables: name, age, and occupation.
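The quoting rule in the note determines the type of each column. The short sketch below is an illustration, not part of the original example: the names quoted and unquoted are made up for it. It shows that numbers typed inside quotes are stored as text and so cannot be used in arithmetic.

```r
library(tibble)  # part of the tidyverse; library(tidyverse) also works

# Hypothetical example objects, used only for this illustration
quoted   <- tibble(age = c("25", "30"))  # quotes make these character strings
unquoted <- tibble(age = c(25, 30))      # no quotes, so these are numbers

class(quoted$age)    # "character"
class(unquoted$age)  # "numeric"
mean(unquoted$age)   # 27.5
```

Calling mean() on quoted$age would return NA with a warning, because the values are text rather than numbers.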
Exercise: Typing Data
Try it yourself:
Starting from the code that manually creates the dataset named data, add a variable named gender after age, with the values “Male” for John, “Female” for Jane, “Female” for Alice, and “Male” for Bob.
Hint 1
The variables name, age, and occupation are each defined on their own line in the form variable_name = c(value1, value2, value3, value4). What can you do to add a variable for gender as described above?
Hint 2
Add a line after age and define the gender variable with values of “Male”, “Female”, “Female”, “Male”. Because occupation follows it, the new line must end with a comma.
gender = c("Male", "Female", "Female", "Male"),
Fully worked solution:
Add a line after age and define the gender variable with values of “Male”, “Female”, “Female”, “Male”.
1. Create a dataset named data by calling the tibble function
2. Create the name variable with values of “John”, “Jane”, “Alice”, “Bob”
3. Create the age variable with values of 25, 30, 28, 22
4. Create the gender variable with values of “Male”, “Female”, “Female”, “Male”
5. Create the occupation variable with values of “Engineer”, “Doctor”, “Artist”, “Student”
6. Close the tibble function call with ) to enclose the arguments of the function
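Put together, the steps above might look like the following. This is a reconstruction from the step descriptions rather than the original solution listing, and the trailing comments mark which step each line corresponds to.

```r
library(tibble)  # part of the tidyverse; library(tidyverse) also works

data <- tibble(                                              # step 1
  name = c("John", "Jane", "Alice", "Bob"),                  # step 2
  age = c(25, 30, 28, 22),                                   # step 3
  gender = c("Male", "Female", "Female", "Male"),            # step 4
  occupation = c("Engineer", "Doctor", "Artist", "Student")  # step 5
)                                                            # step 6
data
```

Printing data should now show four columns, with gender sitting between age and occupation.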
11.2 When to Use Manual Entry
Typing data manually can be useful in the following cases:
Small, quick experiments with data structures.
Building toy datasets for testing functions.
Practicing data manipulation techniques.
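As a small illustration of the toy-dataset case, the sketch below builds a two-row tibble and uses it to check a function. The helper mean_age() and the object toy are hypothetical names invented for this example.

```r
library(tibble)  # part of the tidyverse; library(tidyverse) also works

# Hypothetical helper: returns the average of the age column
mean_age <- function(df) {
  mean(df$age)
}

# A toy dataset small enough to verify the answer by hand: (25 + 30) / 2
toy <- tibble(
  name = c("John", "Jane"),
  age = c(25, 30)
)

# stopifnot() raises an error if the function does not return 27.5
stopifnot(mean_age(toy) == 27.5)
```

Because the data is typed in by hand, the expected result is known in advance, which is what makes a toy dataset useful for testing.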
However, for most real-world datasets, manual entry is impractical. As your datasets grow larger or come from external sources, you’ll need more efficient methods, such as importing data from files or databases.