Graduate School Statistics Interview

23 Terms

1
New cards

Explain your story

Hi, my name is Caroline Cordes. I am currently a senior data science major with a concentration in statistics and a minor in finance. I grew up in Houston, Texas, as a kid who loved math and science and wanted to find a way to combine that with business. I got a postcard my senior year of high school from the University of Arkansas data science program and decided that it was the perfect fit. I've had wonderful opportunities to get involved over my past 3 years here. I helped start the first data science registered student organization on campus and got involved in two mentorship programs, one for first-year data science students and the other a local literacy program at the library. I'm also the manager of a softball analytics team, which partners with the softball team to look at defensive positioning and player KPIs, and I'm a tour guide on campus! During the summers I've had the privilege of interning. Two summers ago I interned at Haleon as a business analyst on their Walmart sales team, working on their cold, cough, flu, allergy, and lip team. I mainly worked on the ChapStick brand, looking at market basket reports on what type of consumer was purchasing ChapStick and at what locations in the store. Last summer I interned at Mastercard as an associate consultant intern, where I worked on two clients: Walmart finance, which deals with capital expenditure (every time they spend money, e.g. new shelving), and a newer client, Prosperity Bank, which is a regional bank in Texas and Oklahoma. I did a lot of A/B testing in this role to determine the effect of the different changes made in stores. I've really enjoyed my time in these roles, but I'm ready to take the next step, further my knowledge, and grow not only as a data scientist but professionally as well.

2
New cards

Why NC State? Why data analytics/this program? What are you most excited about for the program?

NC State, and specifically the Institute for Advanced Analytics, first came onto my radar when a guest speaker from the institute, Aric LaBarr, came to one of my introductory data science courses, which prompted me to look into it a bit more. I've always known that I wanted a graduate degree because I know the doors it can open and the skills it can develop, and with the goal of becoming a data scientist and a future leader, a master's degree felt like the right next step for my career. When I was looking at schools, I wanted one that would offer an industry-relevant education and opportunities for professional growth, and NC State was top of mind and felt like the perfect fit. I'm really looking forward to the opportunity to participate in a practicum. I've been able to learn and grow a lot this semester with my current project, and I'm excited to continue to apply my knowledge to a real-world project and enjoy the learning process that comes with that. I think it's a really unique opportunity to gain industry-relevant experience, and I would be honored to be a part of it.

3
New cards

Where do you want to live after graduation?

I am pretty flexible regarding where I want to live in the future. Staying in North Carolina would be preferred, but the most important thing to me is the company and finding the right culture and fit.

4
New cards

What industry do you want to work in?

Again, this is an area where I'm flexible. I am interested in the finance sector, but I've also had great experiences in the consumer packaged goods and consulting industries. Regardless of industry, I'm looking for a workplace that has a clear growth trajectory, invests in new hires, and is continually evolving.

5
New cards

Teamwork was emphasized: examples of working in a team, problems with a teammate, how would teammates describe you?

I love working in teams. I've had many opportunities to do so, and every time I feel like I learn to collaborate more efficiently. Teammates have described me as reliable and supportive. There have been instances where I've been on teams with teammates with whom I've had disagreements and reliability issues. The most important thing to me whenever I

6
New cards

What do you think will be your biggest challenge in the program?

I think every time I enter a new environment, whether that be a new company or even a new class, there is a learning curve that comes with it. It can take me a little bit to find my footing and start to gain more confidence.

7
New cards

Where do you see yourself in 5 years?

I see myself working in a data-focused job, whether as a data scientist or analyst, where I am seen as a trusted and diligent worker who is on the path to a leadership or managerial role.

8
New cards

Explain a project that you’re passionate about

Currently, for my data science practicum project, we are working with Sam's Club on product embeddings for keyword search, meaning we are using and fine-tuning different large language models to see if we can provide more accurate product results for member search queries. Right now we are trying to replicate the results of an Amazon research paper. A teammate and I have written code to create embeddings with 2 sentence-transformer models and 3 BERT-like models, and built a multilayer perceptron with an initial dense linear layer, max pooling, 10% dropout, and a final classification layer that outputs ESCI labels (exact, substitute, complement, and irrelevant). We put that through training and loss functions and have created confusion matrices that we get to present to leadership in the next few months.

9
New cards

What is an ARIMA model?

An autoregressive integrated moving average (ARIMA) model is a statistical model that uses time series data to understand past behavior and predict future values. It combines three parts: autoregression (AR, the series is regressed on its own past values), integration (I, the series is differenced to remove trend), and a moving average (MA, the model uses past forecast errors). Example: Imagine that you're buying ice cream to stock a small shop. If you know that sales of ice cream have been rising steadily as the weather warms, you should probably predict that next week's order should be a little bigger than this week's order. How much bigger should depend on how much this week's sales differ from last week's sales.
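
The ice cream intuition can be sketched as a hand-rolled ARIMA(1,1,0): difference the series once, then regress each week-over-week change on the previous change. This is a minimal sketch under stated assumptions, not a full ARIMA fit: it has no moving-average term and no constant, and the sales figures and the function name forecast_next are made up for illustration.

```python
def forecast_next(series):
    """One-step-ahead forecast from a simplified ARIMA(1,1,0) with no constant."""
    # "I" (integrated): difference once to get week-over-week changes.
    changes = [later - earlier for earlier, later in zip(series, series[1:])]
    # "AR" (autoregressive): least-squares estimate of phi in
    # change[t] = phi * change[t-1] + noise.
    numerator = sum(prev * cur for prev, cur in zip(changes, changes[1:]))
    denominator = sum(prev * prev for prev in changes[:-1])
    phi = numerator / denominator
    # Undo the differencing: next value = last value + predicted change.
    return series[-1] + phi * changes[-1], phi


weekly_sales = [100, 104, 110, 113, 119, 124]  # hypothetical units sold per week
forecast, phi = forecast_next(weekly_sales)
print(f"phi = {phi:.3f}, next week's forecast = {forecast:.1f}")
```

In practice a library such as statsmodels would estimate all three parts (p, d, q) together; the sketch only shows why the size of the latest change drives the size of the next order.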

10
New cards

What is linear regression? How would you explain it to a non-technical audience?

Linear regression is a method that uses a straight line, or "line of best fit", to describe the relationship between 2 variables, one dependent and the other independent. It helps us predict or estimate the value of one variable (such as the crispiness of bread) based on the value of another variable (such as toasting time).
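
To make the "line of best fit" concrete, here is a minimal sketch of ordinary least squares for one predictor. The toasting times and crispiness scores are hypothetical numbers chosen for illustration, and fit_line is just an illustrative name.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


toasting_minutes = [1, 2, 3, 4, 5]          # independent variable (hypothetical)
crispiness = [2.1, 3.9, 6.2, 7.8, 10.1]     # dependent variable (hypothetical)
slope, intercept = fit_line(toasting_minutes, crispiness)
predicted_at_6_minutes = intercept + slope * 6
print(f"crispiness = {intercept:.2f} + {slope:.2f} * minutes")
```

The slope is the non-technical takeaway: each extra minute of toasting adds about that much crispiness, on average.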

11
New cards

AIC/BIC Models

AIC: The Akaike Information Criterion aims to select the model with the lowest expected prediction error, so lower AIC is better (it wants to find the best predictive model).

BIC: The Bayesian Information Criterion applies a stronger penalty for additional parameters, one that grows with the sample size, making it more likely to choose simpler models (it prioritizes finding the true model).
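
The difference in penalties is easy to see with the standard definitions AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L), where k is the number of parameters, n the number of observations, and L the maximized likelihood. The sketch below uses made-up log-likelihoods for two hypothetical models to show a case where the two criteria disagree.

```python
import math


def aic(log_likelihood, k):
    """Akaike Information Criterion: each parameter costs 2."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: each parameter costs ln(n)."""
    return k * math.log(n) - 2 * log_likelihood


n_observations = 100  # hypothetical sample size
simple_model = {"log_likelihood": -120.0, "k": 3}   # hypothetical fit
complex_model = {"log_likelihood": -116.0, "k": 6}  # fits better, more parameters

aic_simple = aic(simple_model["log_likelihood"], simple_model["k"])
aic_complex = aic(complex_model["log_likelihood"], complex_model["k"])
bic_simple = bic(simple_model["log_likelihood"], simple_model["k"], n_observations)
bic_complex = bic(complex_model["log_likelihood"], complex_model["k"], n_observations)

# Lower is better for both criteria.
print("AIC prefers:", "complex" if aic_complex < aic_simple else "simple")
print("BIC prefers:", "complex" if bic_complex < bic_simple else "simple")
```

With these numbers AIC picks the larger model while BIC picks the smaller one, because ln(100) is more than twice AIC's per-parameter penalty.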

12
New cards

Explain neural networks

13
New cards

What is a p-value?

A p-value (probability value) is the probability of observing data at least as extreme as what was actually observed, assuming the null hypothesis is true. It is not the probability that the null hypothesis is true. A smaller p-value is stronger evidence against the null: if it falls below the chosen significance level, you reject the null hypothesis, and otherwise you fail to reject it. The conventional threshold is 0.05.

T-test = used when the sample is small and the population variance is unknown (it is estimated from the sample)

Z-test = used when the sample is large and the population variance is known
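
The definition of a p-value can be computed exactly for a simple case. This sketch assumes a hypothetical experiment (60 heads in 100 coin flips) and a null hypothesis that the coin is fair; it uses an exact binomial test rather than a t-test or z-test so that no distribution tables are needed.

```python
from math import comb


def two_sided_binomial_p_value(heads, flips):
    """Exact two-sided p-value for the null hypothesis of a fair coin."""
    expected = flips / 2
    observed_gap = abs(heads - expected)
    # Count every outcome at least as far from the expected count as the
    # observed one: this is "at least as extreme" under the null.
    extreme_outcomes = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - expected) >= observed_gap
    )
    # Under a fair coin, each of the 2**flips sequences is equally likely.
    return extreme_outcomes / 2 ** flips


alpha = 0.05
p_value = two_sided_binomial_p_value(heads=60, flips=100)
decision = "reject" if p_value < alpha else "fail to reject"
print(f"p-value = {p_value:.4f}: {decision} the null hypothesis")
```

Here the p-value lands just above 0.05, so even a 60/40 split is not quite enough to reject fairness at the usual threshold.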

14
New cards

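What is Principal Component Analysis?

Principal component analysis (PCA) summarizes the information content of a large dataset in a smaller set of uncorrelated variables (the principal components), ordered by how much of the data's variance each one captures. It performs a linear mapping of the original variables.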
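
For two variables, PCA can be worked out in closed form from the covariance matrix, which makes the idea easy to check by hand. This is a minimal sketch for the two-variable special case only; the data points and the name pca_2d are hypothetical, and real datasets would use a library routine.

```python
import math


def pca_2d(xs, ys):
    """First principal component of two variables via the 2x2 covariance matrix."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]  # center each variable
    dy = [y - mean_y for y in ys]
    var_x = sum(d * d for d in dx) / (n - 1)
    var_y = sum(d * d for d in dy) / (n - 1)
    cov_xy = sum(a * b for a, b in zip(dx, dy)) / (n - 1)
    # Eigenvalues of [[var_x, cov_xy], [cov_xy, var_y]] in closed form.
    center = (var_x + var_y) / 2
    radius = math.hypot((var_x - var_y) / 2, cov_xy)
    largest, smallest = center + radius, center - radius
    # Direction of the first principal component (a unit vector).
    angle = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    direction = (math.cos(angle), math.sin(angle))
    # Linear mapping: project each centered point onto that direction.
    scores = [a * direction[0] + b * direction[1] for a, b in zip(dx, dy)]
    explained = largest / (largest + smallest)
    return direction, scores, explained


xs = [1, 2, 3, 4, 5]             # hypothetical, strongly correlated variables
ys = [1.1, 1.9, 3.2, 3.9, 5.1]
direction, scores, explained = pca_2d(xs, ys)
print(f"first component explains {explained:.1%} of the variance")
```

Because the two variables move almost in lockstep, a single component keeps nearly all of the information, which is the sense in which PCA "summarizes" the data.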

15
New cards

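Explain supervised vs. unsupervised learning.

Supervised learning uses labeled data (inputs paired with known outputs) to learn to make predictions; examples include regression, decision trees, and neural networks. Unsupervised learning is used when there is no labeled data, and the goal is to find structure in the inputs, for example through clustering or dimensionality reduction.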
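
The contrast can be sketched on one small dataset: a nearest-neighbor classifier that needs the labels, and a two-cluster k-means that never sees them. The measurements, the labels "small" and "large", and both function names are hypothetical, chosen only to illustrate the distinction.

```python
def nearest_neighbor_predict(train_x, train_y, x):
    """Supervised: predict the label of the closest labeled training point."""
    closest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[closest]


def two_means(values, iterations=10):
    """Unsupervised: split unlabeled 1-D values into two clusters (k-means, k=2)."""
    low_center, high_center = min(values), max(values)  # simple initialization
    for _ in range(iterations):
        # Assign each value to its nearest center, then recompute the centers.
        low = [v for v in values if abs(v - low_center) <= abs(v - high_center)]
        high = [v for v in values if abs(v - low_center) > abs(v - high_center)]
        low_center, high_center = sum(low) / len(low), sum(high) / len(high)
    return low_center, high_center


measurements = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8]                       # hypothetical inputs
labels = ["small", "small", "small", "large", "large", "large"]    # known outputs

# Supervised: labels are required to make a prediction for a new input.
prediction = nearest_neighbor_predict(measurements, labels, 4.5)

# Unsupervised: only the inputs are used; the groups are discovered.
centers = two_means(measurements)
print(prediction, centers)
```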

16
New cards

How would you write out code for the following concepts

17
New cards

Explain A/B testing

A/B testing compares two versions, a test and a control, to measure the effect of a change and see whether the test version is performing better or worse.
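
One common way to judge whether the test version really differs from the control is a two-proportion z-test on conversion rates. This is a sketch of that one approach, not the only way to analyze an A/B test; the visitor and conversion counts are hypothetical.

```python
import math


def two_proportion_z_test(control_successes, control_n, test_successes, test_n):
    """Return (lift, z, two-sided p-value) for test vs. control conversion rates."""
    p_control = control_successes / control_n
    p_test = test_successes / test_n
    # Under the null hypothesis both groups share one underlying rate.
    pooled = (control_successes + test_successes) / (control_n + test_n)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / test_n))
    z = (p_test - p_control) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_test - p_control, z, p_value


# Hypothetical experiment: 2,000 visitors in each group.
lift, z, p_value = two_proportion_z_test(
    control_successes=200, control_n=2000,
    test_successes=250, test_n=2000,
)
print(f"lift = {lift:.3f}, z = {z:.2f}, p-value = {p_value:.4f}")
```

A positive lift with a p-value below the chosen threshold suggests the change helped; a result above the threshold means the difference could plausibly be noise.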

18
New cards

How would you read a confusion matrix?

Used to evaluate the performance of a classification model. Compares the actual labels vs. the predicted labels.

                  Predicted positive   Predicted negative
Actual positive   True positive        False negative
Actual negative   False positive       True negative
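
Reading the matrix usually means turning its four cells into summary metrics. The sketch below counts the cells from hypothetical actual and predicted labels (1 = positive, 0 = negative) and derives accuracy, precision, and recall.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives/negatives for binary labels (1 = positive)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    return tp, fn, fp, tn


actual = [1, 1, 1, 0, 0, 0, 1, 0]      # hypothetical ground truth
predicted = [1, 0, 1, 0, 1, 0, 1, 0]   # hypothetical model output
tp, fn, fp, tn = confusion_counts(actual, predicted)

accuracy = (tp + tn) / (tp + fn + fp + tn)  # share of all predictions that are correct
precision = tp / (tp + fp)                  # of predicted positives, share truly positive
recall = tp / (tp + fn)                     # of actual positives, share the model found

print(f"          pred +  pred -")
print(f"actual +  {tp:6d}  {fn:6d}")
print(f"actual -  {fp:6d}  {tn:6d}")
```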

19
New cards

Questions for them?

What skills or characteristics do you see in successful candidates?

What kinds of coding languages are emphasized?

20
New cards

What is a Markov Chain?

A Markov chain is a way to model a system that moves between different states over time, where the future state depends only on the current state, not on how the system got there (the Markov property).
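
A small example makes the "only the current state matters" idea concrete. The two weather states and their transition probabilities below are hypothetical; the sketch pushes a probability distribution forward one step at a time using only the current distribution.

```python
def step(distribution, transitions):
    """Advance a probability distribution over states by one time step."""
    next_distribution = {state: 0.0 for state in transitions}
    for state, probability in distribution.items():
        # Where we go next depends only on the current state.
        for target, transition_probability in transitions[state].items():
            next_distribution[target] += probability * transition_probability
    return next_distribution


# Hypothetical transition probabilities: transitions[today][tomorrow].
transitions = {
    "sunny": {"sunny": 0.9, "rainy": 0.1},
    "rainy": {"sunny": 0.5, "rainy": 0.5},
}

today = {"sunny": 1.0, "rainy": 0.0}
in_two_days = step(step(today, transitions), transitions)

# Repeating the step many times approaches the long-run (stationary) distribution.
long_run = today
for _ in range(100):
    long_run = step(long_run, transitions)

print(in_two_days, long_run)
```

After enough steps the distribution stops changing no matter which state the chain started in, which is a typical follow-up point in interviews.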

21
New cards

What is an ANOVA test and how to read it?

Analysis of variance test. Used to determine whether there are significant differences between the means of independent groups, typically three or more (with two groups it is equivalent to a t-test). Helps assess whether at least one group mean is statistically different from the others.

F-statistic: the ratio of the variance between groups to the variance within groups. A large F suggests the group means differ by more than chance alone would explain.

P-value: if p < alpha, you reject the null hypothesis that all group means are equal, concluding that at least one group mean differs significantly.
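
The F-statistic can be computed by hand for a one-way ANOVA, which helps when reading an ANOVA table. This sketch uses three hypothetical groups and stops at F; turning F into a p-value requires the F distribution, which a statistics library would supply.

```python
def one_way_anova_f(groups):
    """F-statistic for a one-way ANOVA: between-group over within-group variance."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Between-group sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Within-group sum of squares: spread of values around their own group mean.
    ss_within = sum(
        (value - mean) ** 2
        for group, mean in zip(groups, group_means)
        for value in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]  # hypothetical measurements
f_statistic = one_way_anova_f(groups)
print(f"F = {f_statistic:.2f} with (2, 6) degrees of freedom")
```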

22
New cards

Explain your experience with different programming languages.

R: Most of my statistics classes have taken place in R: multivariable math, statistical learning, and my current time series class. This is one of the languages I am most familiar with and am comfortable programming in. At the moment, I'm generating softball reports using ball-tracking CSVs provided by the coach. I use ggplot2, ARIMA and other time series models, etc.

Python: This is the programming language that I'm most familiar with. My practicum project is all being coded in Python, and my machine learning class is also in Python at the moment. I typically use Jupyter notebooks and enjoy using scikit-learn and pandas when coding in Python.

SQL: I've taken a class in SQL where our final project was a mock ATM interface that allowed people in our class to add and withdraw money, all built in Visual Studio using SQL and C#. I also did a bit of SQL in my internship last summer, where I looked at customers' first interaction using tap-to-pay and whether it had any incremental impact on spending.

Java: I took an object-oriented programming class in Java my first year of college. It has been a while since I've programmed in it, but I am still confident in my ability to read, understand, and do some preliminary coding in it.


23
New cards