What is supervised vs unsupervised learning?
Supervised learning uses labeled data to predict a target (classification/regression). Unsupervised learning finds structure in unlabeled data (clustering/dimensionality reduction).
What is overfitting?
When a model learns noise instead of patterns, performing well on training data but poorly on new data. Prevent with regularization, cross‑validation, simpler models, or more data.
What is cross-validation?
A technique (often k-fold) that evaluates a model across multiple train/test splits to estimate generalization more reliably and to reveal overfitting that a single split can hide.
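A minimal sketch of how k-fold splitting can work, using only the standard library. The helper name `k_fold_indices` is hypothetical; in practice you would usually shuffle the data first or use a library splitter such as scikit-learn's `KFold`.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds (hypothetical helper)."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(n_samples=10, k=5))
for train_idx, test_idx in folds:
    print("test:", test_idx, "train size:", len(train_idx))
```

Each sample lands in exactly one test fold, so every row is used for evaluation once and for training k-1 times.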
What is the train/validation/test split?
Train: learn patterns. Validation: tune hyperparameters. Test: final unbiased evaluation.
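One way to sketch the three-way split in plain Python. The function name `train_val_test_split` and the 60/20/20 proportions are assumptions chosen for illustration, not a standard.

```python
import random


def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle once, then slice into train/validation/test (hypothetical helper)."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # fixed seed keeps the split reproducible
    n_test = round(len(rows) * test_frac)
    n_val = round(len(rows) * val_frac)
    test_rows = rows[:n_test]
    val_rows = rows[n_test:n_test + n_val]
    train_rows = rows[n_test + n_val:]
    return train_rows, val_rows, test_rows


train_rows, val_rows, test_rows = train_val_test_split(range(100))
print(len(train_rows), len(val_rows), len(test_rows))  # 60 20 20
```

The test rows are set aside first and never touched during tuning, which is what keeps the final evaluation unbiased.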
What is regularization?
Adding a penalty on coefficient size to the loss function (L1 on absolute values, L2 on squared values) to discourage overly complex models and reduce overfitting.
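A small sketch of the L2 case, assuming a one-feature linear model with no intercept. Minimizing sum((y - w*x)^2) + lam * w^2 has the closed form w = sum(x*y) / (sum(x^2) + lam); the name `ridge_slope` and the toy data are made up for illustration.

```python
def ridge_slope(xs, ys, lam):
    """Slope that minimizes squared error plus lam * w**2 (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # exactly y = 2x
for lam in (0.0, 1.0, 10.0):
    print(lam, ridge_slope(xs, ys, lam))
```

With `lam=0.0` the slope is the ordinary least-squares value of 2.0; larger penalties shrink it toward zero.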
What is EDA?
Exploratory Data Analysis — understanding distributions, relationships, anomalies, and guiding feature engineering and modeling.
What is feature scaling?
Normalizing or standardizing features so that distance-based models like KNN and SVM, and models trained with gradient descent, are not dominated by features with large numeric ranges.
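A rough sketch of the two common approaches using the standard library. The function names and the `incomes` data are illustrative, and both helpers assume the column is not constant (otherwise they divide by zero).

```python
import statistics


def min_max_scale(values):
    """Normalize to the [0, 1] range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


incomes = [30_000, 45_000, 60_000, 90_000]
print(min_max_scale(incomes))  # [0.0, 0.25, 0.5, 1.0]
print(standardize(incomes))
```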
What is classification vs regression?
Classification predicts categories; regression predicts continuous values.
What is a confusion matrix?
A table showing TP, FP, TN, FN — used to evaluate classification performance.
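A minimal sketch of tallying the four cells for a binary problem. The helper `confusion_counts` and the label lists are made up for illustration.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN for a binary classifier (hypothetical helper)."""
    counts = Counter()
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            counts["TP" if actual == positive else "FP"] += 1
        else:
            counts["FN" if actual == positive else "TN"] += 1
    return counts


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
print(dict(counts))
```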
What is precision vs recall?
Precision: of predicted positives, how many were correct. Recall: of actual positives, how many were found.
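As a quick worked example, both metrics follow directly from confusion-matrix counts. The counts below are invented, and returning 0.0 for an empty denominator is one convention among several.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


precision, recall = precision_recall(tp=30, fp=10, fn=20)
print(precision, recall)  # 0.75 0.6
```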
What is the bias–variance tradeoff?
Bias: error from overly simple models that underfit. Variance: error from overly complex models that are sensitive to the particular training sample. Goal: minimize total error by balancing the two.
What is SQL used for in data science?
Querying, filtering, aggregating, joining, and transforming structured data.
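A small sketch that touches each of those operations, run through Python's built-in `sqlite3` module so it is self-contained. The `customers` and `orders` tables and their contents are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'East'), (2, 'West'), (3, 'East');
    INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, 25.0), (3, 2, 40.0), (4, 3, 5.0);
""")

# Join orders to customers, filter small orders, aggregate per region.
rows = conn.execute("""
    SELECT c.region, COUNT(*) AS n_orders, SUM(o.amount) AS revenue
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    WHERE o.amount >= 10
    GROUP BY c.region
    ORDER BY revenue DESC
""").fetchall()
conn.close()

print(rows)  # [('East', 2, 75.0), ('West', 1, 40.0)]
```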
How do you handle missing data?
Drop rows/columns, fill with mean/median/mode, forward/backward fill, or model-based imputation depending on context.
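A brief sketch of the simpler options in Pandas, assuming a tiny invented DataFrame. Model-based imputation is not shown.

```python
import pandas as pd

df = pd.DataFrame({
    "age": [25.0, None, 40.0, 31.0],
    "temp": [20.1, None, None, 22.4],
})

n_missing = df.isna().sum()                            # missing count per column
dropped = df.dropna()                                  # drop rows with any missing value
age_filled = df["age"].fillna(df["age"].median())      # fill with the column median
temp_ffill = df["temp"].ffill()                        # carry the last seen value forward

print(n_missing.to_dict())
print(age_filled.tolist())
print(temp_ffill.tolist())
```

Forward fill mostly makes sense for ordered data such as time series; for unordered rows a median or mode fill is usually the safer default.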
How do you detect outliers?
Boxplots, z-scores, IQR method, domain knowledge, or anomaly detection models.
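A minimal sketch of the IQR method, assuming the common 1.5 x IQR fences. The helper name `iqr_outliers` and the data are illustrative; note that `statistics.quantiles` uses its default "exclusive" method here, and other quartile definitions can give slightly different fences.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (hypothetical helper)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


values = [10, 12, 11, 13, 12, 11, 14, 12, 95]
print(iqr_outliers(values))  # [95]
```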
How do you merge datasets in Pandas?
Using pd.merge() with keys and join types (inner, left, right, outer).
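A short sketch of how the join type changes the result, using two invented DataFrames that share a `customer_id` key.

```python
import pandas as pd

customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Ana", "Ben", "Cy"]})
orders = pd.DataFrame({"customer_id": [1, 1, 4], "amount": [50, 25, 40]})

inner = pd.merge(customers, orders, on="customer_id", how="inner")  # keys in both
left = pd.merge(customers, orders, on="customer_id", how="left")    # all customers
outer = pd.merge(customers, orders, on="customer_id", how="outer")  # keys in either

print(len(inner), len(left), len(outer))  # 2 4 5
```

Customers without orders appear in the left join with `NaN` in `amount`, which is an easy way to spot unmatched keys.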
Difference between .loc and .iloc?
.loc is label-based indexing; .iloc is integer-position-based indexing.
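The difference is easiest to see when the index labels are not 0, 1, 2. A small sketch with an invented DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"score": [88, 92, 79]}, index=[10, 20, 30])

by_label = df.loc[10, "score"]    # row whose index label is 10
by_position = df.iloc[0, 0]       # first row, first column
label_slice = df.loc[10:20]       # label slices include both endpoints
position_slice = df.iloc[0:1]     # position slices exclude the stop

print(by_label, by_position, len(label_slice), len(position_slice))  # 88 88 2 1
```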
How do you improve model performance?
Feature engineering, hyperparameter tuning, regularization, trying different algorithms, or collecting more data.
What is a p-value?
Probability of observing results at least as extreme as the current data assuming the null hypothesis is true.
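A worked example under a simple assumption: the null hypothesis is a fair coin, and we observe 8 heads in 10 flips. The helper `binomial_p_value` is hypothetical and computes a one-sided p-value by summing exact binomial probabilities.

```python
from math import comb


def binomial_p_value(successes, trials, p_null=0.5):
    """One-sided P(X >= successes) for X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


p = binomial_p_value(successes=8, trials=10)
print(round(p, 4))  # 0.0547
```

Here p = 56/1024, so at a 0.05 threshold this result would fall just short of rejecting the fair-coin hypothesis.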
What is correlation?
A measure of the strength and direction of the linear relationship between two variables; Pearson's r ranges from -1 to 1.
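A plain-Python sketch of Pearson's r, with invented data. The last call illustrates why "linear" matters: y = x^2 is a perfect relationship, yet its correlation over a symmetric range is zero.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))           # increasing line
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))           # decreasing line
print(pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]))   # y = x**2
```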
What is the central limit theorem?
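The distribution of the sample mean approaches a normal distribution as the sample size grows, regardless of the original distribution, provided the observations are independent and the variance is finite.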
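A quick simulation sketch of the idea, assuming an exponential population (strongly skewed, mean 1, standard deviation 1). The sample size, number of samples, and seed are arbitrary choices.

```python
import random
import statistics

rng = random.Random(42)
sample_size = 50
n_samples = 2000

# Draw many samples from a skewed distribution and record each sample's mean.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
    for _ in range(n_samples)
]

mean_of_means = statistics.fmean(sample_means)
spread_of_means = statistics.stdev(sample_means)
expected_spread = 1 / sample_size ** 0.5  # sigma / sqrt(n)

# For a normal distribution, about 68% of values lie within one standard deviation.
within_one_sd = sum(
    abs(m - mean_of_means) <= spread_of_means for m in sample_means
) / n_samples

print(mean_of_means, spread_of_means, expected_spread, within_one_sd)
```

The sample means should cluster near 1 with a spread close to 1/sqrt(50), and roughly 68% should fall within one standard deviation, even though the underlying data are far from normal.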
Tell me about yourself
“Hi, I’m Victor — I’m completing my M.S. in AI at NJIT, and I’ve taken a lot of coursework in machine learning, data analysis, and statistics. I enjoy working with real data, cleaning it, exploring it, and building models that help answer real questions. I’m excited about Twenty20 Systems because you focus on data integration and analytics that drive real business outcomes, and I’d love to contribute and learn from your team.”
Tell me about a project you worked on
Use structure: Problem → Data → Approach → Tools → Results → What you learned. Pick from your resume: CaloriePics, Domino Game, Recursive Compiler, or NDA project.
Tell me about your CaloriePics project
“I built a full-stack calorie estimation app using React and Flask. I integrated OpenAI’s API for image-based nutritional analysis, handled JSON responses, and built a clean UI with file uploads, loading states, and dynamic rendering. It taught me how to combine AI APIs with real-time frontend interactions.”
Tell me about your Domino Game project
“I developed a real-time multiplayer domino game inside a React app, collaborating with a distributed team. I worked on synchronized gameplay, code reviews, and version control — strengthening teamwork and frontend engineering skills.”
Tell me about your Recursive Compiler project
“I built a recursive compiler in C++ using lexical analysis, syntax trees, and memory management. It deepened my understanding of formal languages and complex system design.”
What is your weakness?
“I sometimes over-focus on getting the analysis perfect, but I’ve been working on balancing thoroughness with speed by setting checkpoints and sharing early drafts with teammates.”
What do you do when you're stuck?
“I break the problem down, check assumptions, test small cases, review documentation, and then ask teammates if needed.”
Why do you want this internship?
“Because it’s the perfect bridge between my academic ML background and real-world data integration and analytics. I want to contribute to projects that help clients make smarter decisions.”
Why should we hire you?
“I bring strong analytical skills, hands-on project experience, and a genuine interest in applying data science to real business problems. I learn quickly, communicate well, and I’m excited to contribute to your mission.”
What does Twenty20 Systems do?
They build data-driven and AI-enabled solutions focused on data integration, business automation, and analytics to help customers make smarter decisions.
What is Twenty20’s mission?
“Your Vision is our Mission” — helping customers succeed through technology adoption.
What are Twenty20’s values?
Customer-focused, people-centric, inclusive, committed to success, professional, collaborative.
Why are you interested in Twenty20 Systems?
They focus on modern architectures, analytics, and real-world impact — perfect for applying academic learning to practical business problems.
How does your background connect to their mission?
“I’ve worked with Python, data analysis, and ML in coursework and projects, and I’m excited to apply those skills to help clients accelerate outcomes through data-driven decisions.”
A role-focused question you could ask
“What does a typical week look like for a data science intern here?”
Another role-focused question
“What kinds of datasets or business problems would I be working on?”
A collaboration question
“How does the team collaborate — standups, code reviews, pair programming?”
A learning-focused question
“What skills or tools do successful interns usually pick up during the internship?”
A mentorship question
“How do you support learning and growth for interns?”
A company-focused question
“How does Twenty20 approach data integration and analytics for clients?”
A culture question
“How would you describe the team culture here?”
A future-projects question
“What upcoming initiatives is the data/AI team excited about?”
Your 20-second pitch
“Hi, I’m Victor — I’m completing my M.S. in AI at NJIT, and I’ve taken a lot of coursework in machine learning, data analysis, and statistics. I enjoy working with real data, cleaning it, exploring it, and building models that help answer real questions. I’m excited about Twenty20 Systems because you focus on data integration and analytics that drive real business outcomes, and I’d love to contribute and learn from your team.”
What to say if you lack experience
“Most of my experience comes from coursework and projects, but I’ve worked with Python, Pandas, NumPy, and Scikit-learn to clean data, run EDA, and build basic models. I’m excited to apply those skills to real business data.”
What to say if asked about a tool you don’t know
“I haven’t used it directly yet, but I understand the concepts and I learn new tools quickly.”