11  Typing Data Manually

Creating Small Data Frames in R

Sometimes, you may want to create a dataset manually in R. This approach is useful for:

  1. Testing small examples.
  2. Learning the structure of data in R.
  3. Quickly experimenting with specific cases.

While manually typing data is rarely practical for larger datasets, it can help you understand how data is organized and manipulated in R.



11.1 Creating a Data Frame

In R, tabular data is commonly stored in data frames or tibbles (the tidyverse's version of a data frame). You can create a small one manually using the tibble() function, which is available once the tidyverse is loaded.

Here’s an example:

## Manually creating a small dataset
library(tidyverse)  # provides tibble()

data <- tibble(
  name = c("John", "Jane", "Alice", "Bob"),
  age = c(25, 30, 28, 22),
  occupation = c("Engineer", "Doctor", "Artist", "Student")
)
data
# A tibble: 4 × 3
  name    age occupation
  <chr> <dbl> <chr>     
1 John     25 Engineer  
2 Jane     30 Doctor    
3 Alice    28 Artist    
4 Bob      22 Student   
Note

When entering data:

  • Use quotes (") for string values (e.g., "John").
  • Do not use quotes for numeric values (e.g., 25).
  • End each argument with a comma (,), except for the last argument.

This creates a tibble with three variables: name, age, and occupation.
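
The quoting rule in the note determines how each column is stored. As a minimal sketch, the same four rows can also be built with base R's data.frame(), used here only so the example runs without extra packages. The name people is illustrative and does not appear elsewhere in this chapter. Calling str() shows which columns were stored as text and which as numbers:

```r
# Same data as above, built with base R's data.frame() instead of tibble().
# `people` is an illustrative name chosen for this sketch.
people <- data.frame(
  name = c("John", "Jane", "Alice", "Bob"),
  age = c(25, 30, 28, 22),
  occupation = c("Engineer", "Doctor", "Artist", "Student"),
  stringsAsFactors = FALSE  # the default since R 4.0; explicit for older versions
)

# Quoted values become character columns; unquoted numbers become numeric.
str(people)

# Quoting a number turns it into text, so it can no longer be used in arithmetic.
class(c(25, 30))      # "numeric"
class(c("25", "30"))  # "character"
```

If a column that should be numeric shows up as chr, a stray pair of quotes around one of the values is a likely cause.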

Exercise: Typing Data

Try it yourself:


Starting with the code that manually creates a dataset named data, change the code to add an additional variable after age named gender where the gender of John is “Male”, Jane is “Female”, Alice is “Female”, and Bob is “Male”.

Hint 1

The variables name, age, and occupation are created by a new line with variable_name = c(value1, value2, value3, value4). What can you do to add a variable for gender as described above?

Hint 2

Add a line after age and define the gender variable with values of “Male”, “Female”, “Female”, “Male”.

  gender = c("Male", "Female", "Female", "Male"),

Fully worked solution:

Add a line after age and define the gender variable with values of “Male”, “Female”, “Female”, “Male”.

data <- tibble(
  name = c("John", "Jane", "Alice", "Bob"),
  age = c(25, 30, 28, 22),
  gender = c("Male", "Female", "Female", "Male"),
  occupation = c("Engineer", "Doctor", "Artist", "Student")
)
data

Line by line:

  1. Create a dataset named data by calling the tibble() function.
  2. Create the name variable with values of “John”, “Jane”, “Alice”, “Bob”.
  3. Create the age variable with values of 25, 30, 28, 22.
  4. Create the gender variable with values of “Male”, “Female”, “Female”, “Male”.
  5. Create the occupation variable with values of “Engineer”, “Doctor”, “Artist”, “Student”.
  6. Close the tibble() call with ) to end the list of arguments.


11.2 When to Use Manual Entry

Typing data manually can be useful in the following cases:

  • Small, quick experiments with data structures.
  • Building toy datasets for testing functions.
  • Practicing data manipulation techniques.
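
As one possible illustration of the second case, a hand-typed dataset makes it easy to check a function because the correct answer is known in advance. The helper older_than() and the name toy below are hypothetical, introduced only for this sketch, and base R's data.frame() is used so the example runs without extra packages:

```r
# Hypothetical helper: returns the names of people older than `min_age`.
older_than <- function(df, min_age) {
  df$name[df$age > min_age]
}

# A toy dataset typed by hand.
toy <- data.frame(
  name = c("John", "Jane", "Alice", "Bob"),
  age = c(25, 30, 28, 22),
  stringsAsFactors = FALSE
)

# Only Jane (30) and Alice (28) are older than 26.
result <- older_than(toy, 26)
result
stopifnot(identical(result, c("Jane", "Alice")))
```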

However, for most real-world datasets, manual entry is impractical. As your datasets grow larger or come from external sources, you’ll need more efficient methods, such as importing data from files or databases.
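
To give a rough sense of what importing looks like, the sketch below writes a tiny data frame to a temporary CSV file and reads it back with base R's read.csv(). The temporary file and the names path and imported are assumptions made so the example is self-contained. With real data the file would already exist and only the read step would be needed:

```r
# Write a tiny data frame to a temporary CSV file so there is something to import.
path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(name = c("John", "Jane"), age = c(25, 30)),
  path,
  row.names = FALSE
)

# Import the file instead of typing the values by hand.
imported <- read.csv(path, stringsAsFactors = FALSE)
imported

unlink(path)  # remove the temporary file
```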