Midterm Exam Practice – In Class Portion
This exam is quite a bit longer than the exam I will give you in class, but I want you to have a feel for the types of questions I might ask.
You can find the git repository for this midterm at https://github.com/stat-assignments/2025-850-practice-midterm.
You should be able to clone it to your machine, but you should not be able to push your results back to the repository.
Note – these directions are copied directly from the exam.
This exam is due at the end of class on October 23, 2025.
I will grade the exam as it is pushed to GitHub. I cannot grade the exam that exists on your computer, so please double-check your GitHub repository to ensure that the file on GitHub is the file you want me to grade.
For each of these problems, you may choose to solve the problem in either R or python. I have provided both R and python chunks. You should feel free to remove the unused chunk for each question.
(5 points) Please put your code under the comment corresponding to the part you are working on. This will help me to grade your work more efficiently.
Rules
You may use the textbook and your notes on this exam.
If you need to search for ‘how do I do X in Y language’, you may do so using Google or DuckDuckGo, but you must 1) document that you did the search, and 2) provide a link to the website you used to get a solution. AI results should be turned off by adding “ -ai” to your query.
AI and LLM usage is strictly forbidden. Use of any unauthorized resources will result in a 0 on this exam.
You must be able to explain how any code you submit on this exam works.
- Oral exams based on your submissions will be held the week of October 28, 2025.
- You will be notified of the need for an oral exam by Monday, October 28.
- If you are notified that an oral exam is required, you must schedule a time for the exam within 24h.
If you get stuck, you may ask Dr. Vanderplas for the solution to the problem you are stuck on, at the cost of the points which would be awarded for that problem. This is designed to get you un-stuck and allow you to complete multi-part problems.
(5 points) Your submitted qmd file must compile without errors. Use error=TRUE in a chunk if it is supposed to return an error or if you cannot get the code to work properly.
Least Common Multiple
In this problem, you will work towards building code that will find the least common multiple of two numbers.
The least common multiple of \(a\) and \(b\) is the smallest positive number \(c\) for which \(c/a\) and \(c/b\) both produce integer values.
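As an illustration of this definition, here is a minimal Python sketch that searches for the smallest positive value divisible by both inputs. The function name lcm_by_search is hypothetical and is not part of the exam template; the sketch assumes both inputs are positive integers.

```python
def lcm_by_search(a: int, b: int) -> int:
    """Return the smallest positive c for which c/a and c/b are both integers."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive integers")
    step = max(a, b)
    c = step
    # Walk through multiples of the larger input until the smaller one divides it too.
    while c % a != 0 or c % b != 0:
        c += step
    return c


print(lcm_by_search(4, 6))  # 12
```

This brute-force search is only meant to make the definition concrete; the prewritten functions used later in the exam are the practical choice.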
We’ll start by installing a package that helps work with prime numbers… cheapr in R, math in Python.
Skill: Install Packages
R
Write code to install the cheapr package in R. Run the code.
This practice problem has no equivalent install command in Python (math is installed by default), but you should know how to install a Python package as well, either via a terminal/bash chunk or within a Python chunk using IPython magic commands.
Thinking Critically
How could you ensure that the code above to install the package isn’t evaluated? Think of at least two ways.
Skill: Loading packages
Load the packages you just installed in R and Python using the chunks below.
R
Python
Skill: Using prewritten functions
R
Use the scm2(x, y) function to find the smallest common multiple between 35 and 49.
Python
Use the math.lcm(a, b) function to get the smallest common multiple between 3726 and 9321. Store this number in a variable called multiple.
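A minimal sketch of what this call might look like is below. It assumes Python 3.9 or newer, since math.lcm is not available in earlier versions.

```python
import math

# math.lcm was added in Python 3.9.
multiple = math.lcm(3726, 9321)

# Quick sanity check: the result should be divisible by both inputs.
print(multiple % 3726 == 0 and multiple % 9321 == 0)  # True
print(multiple)
```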
Skill: Matrices and Loops
You have a vector of values: \(x = [23, 81, 264, 198, 261, 18, 35]\).
Create a 7x7 matrix where \(M_{ij}\) contains the least common multiple of \(x_i\) and \(x_j\).
R
Python
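One possible approach in Python, sketched under the assumption that numpy is installed and that math.lcm (Python 3.9+) is used for each entry, is to pre-allocate the matrix and fill it with a nested loop. The names M and n are illustrative.

```python
import math

import numpy as np

x = [23, 81, 264, 198, 261, 18, 35]
n = len(x)

# Pre-allocate an integer matrix, then fill it one entry at a time.
M = np.zeros((n, n), dtype=int)
for i in range(n):
    for j in range(n):
        M[i, j] = math.lcm(x[i], x[j])

print(M)
```

Because lcm(a, b) equals lcm(b, a), the resulting matrix is symmetric, which is a quick way to check the loop indices.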
Skill: Creating Data Frames
You have a vector of values: \(x = [23, 81, 264, 198, 261, 18, 35]\).
R
Use the tidyr function expand_grid to create a data frame with columns \(x\) and \(y\), each containing values of \(x\). The resulting data frame should have 49 rows.
Python
Use a list comprehension to create a DataFrame (perhaps from a list) with columns \(x\) and \(y\), each containing values of \(x\). The resulting data frame should have 49 rows and two columns.
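A minimal sketch, assuming pandas is installed: a nested list comprehension builds every (x, y) pair as a tuple, and the list of tuples is then passed to the DataFrame constructor. The names pairs and df are illustrative.

```python
import pandas as pd

x = [23, 81, 264, 198, 261, 18, 35]

# Every (x, y) combination, built with a nested list comprehension.
pairs = [(xi, yi) for xi in x for yi in x]
df = pd.DataFrame(pairs, columns=["x", "y"])

print(df.shape)  # (49, 2)
```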
Skill: Creating New Variables
Create a new column, lcm, in the data frame from the previous step, containing the least common multiple of columns \(x\) and \(y\).
R
Python
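One way this could be done in Python is sketched below. It rebuilds the 49-row data frame so that the example stands alone, and it assumes pandas and Python 3.9+ (for math.lcm). Since math.lcm works on scalars rather than whole columns, the sketch applies it pair by pair.

```python
import math

import pandas as pd

x = [23, 81, 264, 198, 261, 18, 35]
df = pd.DataFrame([(xi, yi) for xi in x for yi in x], columns=["x", "y"])

# math.lcm is not vectorized, so compute it for each (x, y) pair in turn.
df["lcm"] = [math.lcm(a, b) for a, b in zip(df["x"], df["y"])]

print(df.head())
```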
Skill: Data Transformations
Use your data transformation skills to reshape the lcm column of your (long-form) data frame into a wide data frame with a form similar to your matrix, e.g. with columns named yXXX, where XXX is the value of y.
R
Python
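A possible Python sketch of this reshaping uses DataFrame.pivot and then renames the columns. It assumes pandas and Python 3.9+, and it rebuilds the long-form data frame so the example is self-contained; the name wide is illustrative.

```python
import math

import pandas as pd

x = [23, 81, 264, 198, 261, 18, 35]
df = pd.DataFrame([(xi, yi) for xi in x for yi in x], columns=["x", "y"])
df["lcm"] = [math.lcm(a, b) for a, b in zip(df["x"], df["y"])]

# One row per x, one column per y, cells filled with the lcm values.
wide = df.pivot(index="x", columns="y", values="lcm")

# Rename the columns to the yXXX pattern and turn x back into a regular column.
wide.columns = [f"y{col}" for col in wide.columns]
wide = wide.reset_index()

print(wide)
```

Note that pivot sorts the row and column labels, so the order differs from the order of the original vector.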
Skill: Subsets and Indexing
Get all rows in your (long-form) data frame where \(x\) and \(y\) have no common factors (that is, LCM(x, y) = x*y). Store these rows in a data frame called no_common_factors.
R
Python
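In Python, the condition LCM(x, y) = x*y can be written as a boolean mask and used to index the data frame. The sketch below assumes pandas and Python 3.9+ and rebuilds the long-form data frame so it can run on its own.

```python
import math

import pandas as pd

x = [23, 81, 264, 198, 261, 18, 35]
df = pd.DataFrame([(xi, yi) for xi in x for yi in x], columns=["x", "y"])
df["lcm"] = [math.lcm(a, b) for a, b in zip(df["x"], df["y"])]

# Keep only the rows where the LCM equals the product of x and y.
mask = df["lcm"] == df["x"] * df["y"]
no_common_factors = df[mask]

print(no_common_factors)
```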
Skill: Writing Functions & Type Conversion
In the previous section, you determined the LCM of values \(x\) and \(y\).
In R and python, write a function, gcd(x, y), which will find the greatest common divisor between \(x\) and \(y\).
Note: Your GCD should be an integer.
Hint: Can you use the LCM of \(x\) and \(y\) to find the GCD?
Planning
Using the provided scratch paper (please put your name at the top), sketch a basic program flow map that shows how the code you’ve already written fits together to solve this problem. Identify any bits of logic you need to write to solve the problem.
My solution is sketched out on sheet ___
R
Python
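Following the hint, one way to sketch this in Python relies on the identity gcd(x, y) * lcm(x, y) = x * y, which holds for positive integers. The sketch assumes Python 3.9+ for math.lcm and assumes both inputs are positive integers; it is one possible solution, not necessarily the one in the key.

```python
import math


def gcd(x: int, y: int) -> int:
    """Greatest common divisor, using gcd(x, y) * lcm(x, y) == x * y."""
    # Floor division (//) keeps the result an int; plain / would return a float.
    return (x * y) // math.lcm(x, y)


print(gcd(35, 49))  # 7
```

Using // rather than / is the type-conversion step here: the division is always exact, so nothing is lost, and the result stays an integer.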
Skill: Data Frames, Loops
Use your function to create a new column in your data frame, gcd, and populate it using a loop.
R
Python
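A minimal Python sketch of the loop is below. It assumes pandas and Python 3.9+, and it redefines a gcd function and the data frame so the example runs on its own; the name gcd_values is illustrative.

```python
import math

import pandas as pd


def gcd(x: int, y: int) -> int:
    """Greatest common divisor of two positive integers, computed from the LCM."""
    return (x * y) // math.lcm(x, y)


x = [23, 81, 264, 198, 261, 18, 35]
df = pd.DataFrame([(xi, yi) for xi in x for yi in x], columns=["x", "y"])

# Loop over the rows, collecting one gcd value per (x, y) pair.
gcd_values = []
for a, b in zip(df["x"], df["y"]):
    gcd_values.append(gcd(a, b))

df["gcd"] = gcd_values
print(df.head())
```

Collecting the values in a list and assigning the column once avoids modifying the data frame inside the loop.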
Skill: String Operations
Take the data frame you created in the previous problem and write a format_results(df) function that will output the results of each pair of \((x, y)\) as “The LCM of (a, b) is c and the GCD is d”, where a, b, c, d are the values x, y, LCM, and GCD.
Hint:
- Python: in a DataFrame, you can convert the whole column to a string using df.colname.astype("str") (replace df, colname with appropriate data frame name and column name)
R
Python
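One way the Python version might look, using the astype("str") hint, is sketched below. It assumes pandas and a data frame with columns named x, y, lcm, and gcd, and it labels the values as LCM and GCD to match those column names. The small data frame at the end is made up for demonstration.

```python
import pandas as pd


def format_results(df: pd.DataFrame) -> pd.Series:
    """Return one sentence per row describing the LCM and GCD of (x, y)."""
    # Convert each numeric column to strings so the pieces can be concatenated.
    x = df["x"].astype("str")
    y = df["y"].astype("str")
    lcm = df["lcm"].astype("str")
    gcd = df["gcd"].astype("str")
    return "The LCM of (" + x + ", " + y + ") is " + lcm + " and the GCD is " + gcd


demo = pd.DataFrame({"x": [35, 18], "y": [49, 81], "lcm": [245, 162], "gcd": [7, 9]})
for line in format_results(demo):
    print(line)
```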
Skill: Control Statements
Modify the code you wrote in the previous section. If \(x\) and \(y\) have no common factors, instead of outputting the LCM and GCD, output “__ and __ have no common factors”.
Planning
What modifications will you need to make to handle this additional requirement?
How can you use previously written code and functions to accomplish this task?
What additional code do you need to write?
R
Python
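A possible Python sketch of the modified function uses an if statement on each row. It assumes pandas and columns named x, y, lcm, and gcd; the helper describe_pair is a hypothetical name introduced for illustration, and a GCD of 1 is used as the test for "no common factors".

```python
import pandas as pd


def describe_pair(x: int, y: int, lcm: int, gcd: int) -> str:
    """Describe one (x, y) pair, with a special message when gcd is 1."""
    if gcd == 1:
        return f"{x} and {y} have no common factors"
    return f"The LCM of ({x}, {y}) is {lcm} and the GCD is {gcd}"


def format_results(df: pd.DataFrame) -> list:
    """Apply describe_pair to every row of the data frame."""
    return [
        describe_pair(row.x, row.y, row.lcm, row.gcd)
        for row in df.itertuples(index=False)
    ]


demo = pd.DataFrame({"x": [23, 35], "y": [81, 49], "lcm": [1863, 245], "gcd": [1, 7]})
for line in format_results(demo):
    print(line)
```

Checking gcd == 1 is equivalent to checking lcm == x * y for positive integers, so either previously computed column can drive the condition.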
Skill: User-proofing your function
It is never safe to assume that your user knows what they are doing. Can you make your function from the previous part more robust by testing the user input to ensure that it conforms to your expectations?
Planning
What assumptions does your previous answer make about parameters?
What do you need to test to ensure those assumptions are met?
R
Python
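As a sketch of the kind of checks that could be added in Python, the function below tests that the input is a DataFrame, that the expected columns exist, and that they hold positive integers. The function name check_results_input and the required column names are assumptions made for illustration; your own function may rely on different assumptions.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype

REQUIRED_COLUMNS = ["x", "y", "lcm", "gcd"]  # assumed column names


def check_results_input(df) -> None:
    """Raise an informative error if df does not look as expected."""
    if not isinstance(df, pd.DataFrame):
        raise TypeError("df must be a pandas DataFrame")
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"df is missing required columns: {missing}")
    for col in REQUIRED_COLUMNS:
        if not is_integer_dtype(df[col]):
            raise TypeError(f"column '{col}' must contain integers")
        if (df[col] <= 0).any():
            raise ValueError(f"column '{col}' must contain only positive values")


good = pd.DataFrame({"x": [35], "y": [49], "lcm": [245], "gcd": [7]})
check_results_input(good)  # passes silently
```

Calling a check like this at the top of format_results lets the function fail early with a clear message instead of producing a confusing error later.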
Skill: Summarizing Data
For each \(x\), create a data frame with the \(y\) that has the largest greatest common divisor with \(x\).
R
Python
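One possible Python sketch uses groupby with idxmax to pick, for each x, the row with the largest gcd. It assumes pandas and Python 3.9+. It also makes an assumption that the question does not state: pairs where y equals x are excluded, since otherwise every x would trivially be matched with itself. The names others and best are illustrative.

```python
import math

import pandas as pd

x = [23, 81, 264, 198, 261, 18, 35]
df = pd.DataFrame([(xi, yi) for xi in x for yi in x], columns=["x", "y"])
df["gcd"] = [math.gcd(a, b) for a, b in zip(df["x"], df["y"])]

# Assumption: drop the y == x pairs, which would always win with gcd == x.
others = df[df["x"] != df["y"]]

# For each x, find the row label with the largest gcd, then select those rows.
best = others.loc[others.groupby("x")["gcd"].idxmax()]

print(best)
```

When several values of y tie for the largest gcd, idxmax keeps the first one encountered, so a different tie-breaking rule would need extra code.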
Additional Questions (out of class portion)
Be prepared to identify which functions do which things in e.g. matching or multiple choice questions
Be prepared to summarize the arguments made in different “practical” reading articles
A key for this exam can be found here.