Define Data Science |
Define Data Profiling |
What is regression? Which models can you use to solve a regression problem? |
What is linear regression? When do we use it? |
What is gradient descent? How does it work? |
What is the normal equation? |
What is SGD — stochastic gradient descent? What’s the difference with the usual gradient descent? |
What is overfitting? |
What is K-fold cross-validation? |
What is classification? Which models would you use to solve a classification problem? |
What is logistic regression? When do we need to use it? |
What is sigmoid? What does it do? |
What is accuracy? |
What is the confusion table? What are the cells in this table? |
What are precision, recall, and F1-score? |
What is the ROC curve? When to use it? |
What is AUC (AU ROC)? When to use it? |
What is the PR (precision-recall) curve? |
What is the area under the PR curve? Is it a useful metric? |
What is regularization? Why do we need it? |
What is feature selection? Why do we need it? |
What is random forest? |
What are gradient boosting trees? |
What is ReLU? How is it better than sigmoid or tanh? |
What is dropout? Why is it useful? How does it work? |
What is backpropagation? How does it work? Why do we need it? |
What is Adam? What’s the main difference between Adam and SGD? |
What is model checkpointing? |
What is transfer learning? How does it work? |
What is object detection? Do you know any architectures for that? |
What is object segmentation? Do you know any architectures for that? |
What is bag of words? How can we use it for text classification? |
What is TF-IDF? How is it useful for text classification? |
What is unsupervised learning? |
What is clustering? When do we need it? |
What is the curse of dimensionality? Why do we care about it? |
What is the ranking problem? Which models can you use to solve it? |
What are precision and recall at k? |
What is mean average precision at k? |
What is a recommender system? |
What is collaborative filtering? |
What is the cold start problem? |
What is a time series? |
What methods for solving linear regression do you know? |
What are MSE and RMSE? |
What are decision trees? |
What are the main parameters of the decision tree model? |
What are the benefits of a single decision tree compared to more complex models? |
What are the main parameters of the random forest model? |
What are the potential problems with many large trees? |
What are the main parameters in the gradient boosting model? |
What are the problems with sigmoid as an activation function? |
What are augmentations? Why do we need them? What kind of augmentations do you know? How to choose which augmentations to use? |
What are the advantages and disadvantages of bag of words? |
What are N-grams? How can we use them? |
What are word embeddings? Why are they useful? Do you know Word2Vec? |
What other clustering algorithms do you know? |
What are good unsupervised baselines for text information retrieval? |
What are good baselines when building a recommender system? |
What are the problems with using trees for solving time series problems? |
What do we do with categorical variables? |
How do we check if a variable follows the normal distribution? |
How do we choose K in K-fold cross-validation? What’s your favorite K? |
How do we evaluate classification models? |
How do we select the right regularization parameters? |
How do we interpret weights in linear models? |
How do we train decision trees? |
How do we handle categorical variables in decision trees? |
How do we select the depth of the trees in random forest? |
How do we know how many trees we need in random forest? |
How do you approach tuning parameters in XGBoost or LightGBM? |
How do you select the number of trees in the gradient boosting model? |
How do we use SGD (stochastic gradient descent) for training a neural net? |
How do we decide when to stop training a neural net? |
How do you do an online evaluation of a new ranking algorithm? |
How to validate your models? |
How to interpret the AU ROC score? |
How to set the learning rate? |
How to select K for K-means? |
Are CNNs resistant to rotations? What happens to the predictions of a CNN if an image is rotated? |
Are there any differences between continuous and discrete variables when it comes to feature importance of gradient boosting models? |
Can we have both L1 and L2 regularization components in a linear model? |
Can we use L1 regularization for feature selection? |
Can we use L2 regularization for feature selection? |
Can we formulate the search problem as a classification problem? How? |
Can you explain how cross-validation works? |
Can you tell us how you approach the model training process? |
Do we want to have a constant learning rate, or should we change it throughout training? |
Do you know any other ways to get word embeddings? |
Do you know how K-means works? |
Do you know how DBScan works? |
Do you know any dimensionality reduction techniques? |
Do you know how to use gradient boosting trees for ranking? |
Feature importance in gradient boosting trees — what are possible options? |
How can we know which features are more important for the decision tree model? |
How can we use machine learning for text classification? |
How can you use neural nets for text classification? |
How can we use CNN for text classification? |
How can we use machine learning for search? |
How can we get training data for our ranking algorithms? |
How can we use click data as the training data for ranking algorithms? |
What does L2 regularization look like in a linear model? |
How does a usual fully-connected feed-forward neural network work? |
How does max pooling work? Are there other pooling techniques? |
How is time series different from the usual regression problem? |
What does L1 regularization look like in a linear model? |
How large should N be for our bag of words when using N-grams? |
How can we initialize the weights of a neural network? |
How can we use neural nets for computer vision? |
How can we incorporate implicit feedback (clicks, etc.) into our recommender systems? |
How would you evaluate your ranking algorithms? Which offline metrics would you use? |
If a weight for one variable is higher than for another — can we say that this variable is more important? |
If there’s a trend in our series, how can we remove it? And why would we want to do it? |
If you have a sentence with multiple words, you may need to combine multiple word embeddings into one. How would you do it? |
In which cases is AU PR better than AU ROC? |
Is accuracy always a good metric? |
Is feature selection important for linear models? |
Is it easy to parallelize training of a random forest model? How can we do it? |
Is it possible to parallelize training of a gradient boosting model? How to do it? |
Is logistic regression a linear model? Why? |
Possible approaches to solving the cold start problem? |
Precision-recall trade-off |
What happens to our linear regression model if we have three columns in our data: x, y, z — and z is a sum of x and y? |
What happens to our linear regression model if the column z in the data is a sum of columns x and y and some random noise? |
What happens when we have correlated features in our data? |
What happens when the learning rate is too large? Too small? |
What if we want to build a model for predicting prices? Are prices distributed normally? Do we need to do any pre-processing for prices? |
What if instead of finding the best split, we randomly select a few splits and just select the best from them. Will it work? |
What if we set all the weights of a neural network to 0? |
What kind of regularization techniques are applicable to linear models? |
What kinds of problems can neural nets solve? |
What kind of CNN architectures for classification do you know? |
What regularization techniques for neural nets do you know? |
What’s a convolutional layer? |
What’s pooling in CNN? Why do we need it? |
What’s singular value decomposition? How is it typically used for machine learning? |
What’s the normal distribution? Why do we care about it? |
What’s the effect of L2 regularization on the weights of a linear model? |
What’s the difference between L2 and L1 regularization? |
What’s the interpretation of the bias term in linear models? |
What’s the difference between random forest and gradient boosting? |
What’s the difference between grid search parameter tuning strategy and random search? When to use one or another? |
What’s the learning rate? |
When do we need to perform feature normalization for linear models? When is it okay not to do it? |
When would you use Adam and when SGD? |
When would you choose K-means and when DBScan? |
Which feature selection techniques do you know? |
Which metrics for evaluating regression models do you know? |
Which model would you use for text classification with bag of words features? |
Which models do you know for solving time series problems? |
Which optimization techniques for training neural nets do you know? |
Which parameter tuning strategies (in general) do you know? |
Which regularization techniques do you know? |
Why do we need to split our data into three parts: train, validation, and test? |
Why do we need one-hot encoding? |
Why do we need randomization in random forest? |
Why do we need activation functions? |
Why do we actually need convolutions? Can’t we use fully-connected layers for that? |
Would you prefer gradient boosting trees model or logistic regression when doing text classification with bag of words? |
Would you prefer gradient boosting trees model or logistic regression when doing text classification with embeddings? |
You have a series with only one variable “y” measured at time t. How do you predict “y” at time t+1? Which approaches would you use? |
You have a series with a variable “y” and a set of features. How do you predict “y” at t+1? Which approaches would you use? |
Define the Python pandas operations used in data science |
Define the structure of Artificial Neural Networks |
Define Back Propagation and its working process |
Define Lambda functions with example |
Define Power Analysis in R |
What are the types of machine learning? |
What is supervised learning in machine learning? |
What is unsupervised learning in machine learning? |
What are the commonly used python packages? |
What are the commonly used R packages? |
What is precision? |
What is recall? |
What is a normal distribution? |
What is overfitting? |
What is underfitting? |
What is a univariate analysis? |
What is the Pearson correlation? |
What is the common perception about visualization? |
What are the time series algorithms? |
What is the basic responsibility of a Data Scientist? |
Why does SAS stand out as the best among data analytics tools? |
What is RUN-Group processing? |
What is the right way to validate the SAS program? |
What is meant by precision and recall? |
What is deep learning? |
What is the F1 score? |
What is the difference between machine learning and data mining? |
What are confounding variables? |
What are the types of biases that can occur during sampling? |
What is an alias in an import statement? Why is it used? |
What is a nonparametric test used for? |
What are the pros and cons of the Decision Tree algorithm? |
What are the pros and cons of the Naive Bayes algorithm? |
What are the types of Skewness? |
What is skewed data? |
What is the skewness of this data? 27 ; 28 ; 30 ; 32 ; 34 ; 38 ; 41 ; 42 ; 43 ; 44 ; 46 ; 53 ; 56 ; 62 |
What is an outlier? |
What are the applications of data science? |
What are the steps in exploratory data analysis? |
What are the types of data available in Enterprises? |
What are the various types of analysis on type of data? |
What is the difference between primary data and secondary data? |
What is the difference between qualitative and quantitative data? |
What is a histogram? |
What are the common measures of central tendency? |
What are quartiles? |
What are the commonly used error metrics in regression tasks? |
What are the commonly used error metrics for classification tasks? |
What is it called when there is more than one explanatory variable in the regression task? |
What are residuals in a regression task? |
What are the main classifications in Machine learning? |
What are the main types of supervised learning tasks? |
What is the R-squared value? |
What are some common ways of imputation? |
What is the difference between a Series and a list? |
What parameter is used to update the data without explicitly assigning the data to a variable? |
What is the difference between a dictionary and a set? |
What is the function to create test train split? |
What is pickling? |
What is unpickling? |
What are the most common web frameworks of Python? |
What are lambda functions in Python, and how are they different from def (defining functions) in Python? |
What is your opinion on our current data process? |
What do you understand by the term Normal Distribution? |
What is data visualization? |
What Is a System? |
What are the different benefits of language? |
What are the two main elements of the hottest architecture? |
What is Logistic Regression? |
What are the different features of the machine learning process? |
What Is Normal Distribution? |
What is Linear Regression? |
What Is Interpolation and Extrapolation? |
What is Power Analysis? |
What is K-means? How can we choose K for K-means? |
What is a recommender system? |
What is TF-IDF vectorization? |
What is the Cluster Model? |
What is the regulatory model? |
What are eigenvectors and eigenvalues? |
What is pylab? |
What is PEP8? |
What is monkey patching in Python? |
What is a list comprehension? |
What is the output of the code below? |
What are the basic assumptions for linear regression? |
What is the benefit of dimension reduction before applying an SVM? |
What is Data Science? |
What are the skills required in Data Science? |
What is Machine Learning? |
What is the difference between Traditional Programming and Machine Learning? |
What are the types of Machine Learning Algorithms? |
What are the main components of a data science project? |
What percentage of time is usually required for each component in data science projects? |
What is Artificial Intelligence? |
What is Deep Learning (DL)? |
What is Backpropagation? |
What is Stochastic Gradient Descent? |
What is Data Science? |
What is the difference between iloc and loc? |
What package is used to import data from the Oracle server? |
What do the review process do? |
What do Dummies do? |
What is the curtain? |
What is the removal of data backward in advance? |
What is Unequal Data? |
What is standardization? |
What is pandas in data science, and for which kinds of data is it suitable? |
What is a p-value? |
What is Data science? What is the role of Machine Learning in Data science? |
What do you mean by Type I error and Type II error in hypothesis testing? |
What is Logistic regression? How will you evaluate your Logistic regression model? |
What is the difference between ANOVA and t-test? |
What is the difference between Overfitting and Underfitting? |
What are the steps involved in an analytics project? |
What are the main packages used in Python for data science and machine learning? |
What are the assumptions required for linear regression? |
What is the difference between Covariance and correlation? |
What is Gradient descent? |
What is Regularization? |
What do you mean by Imbalanced Classes? |
What is the bias-variance trade-off? |
What are the evaluation metrics for classification algorithms? |
What are Ensemble, Bagging and Boosting? |
What skills are required to learn data science with Python? |
What are the types of joins? |
What types of database requests does Flask allow? |
What are the Various Methods for Sequential Supervised Learning? |
In what areas is pattern recognition used? |
What are the supported data types in Python? |
What is Flask? Is Flask equivalent to the MVC model? |
What are the types of Bias? |
What are the Different Data Structures in R? |
Name the commonly used algorithms. |
Name a few methods for missing value treatment. |
Name some Classification Algorithms. |
Name some Python libraries used in Machine Learning. |
Name some supervised and unsupervised deep learning algorithms. |
Name some Python libraries used in Deep Learning |
Write code to sort a DataFrame in Python in descending order. |
Write code using pandas |
Write the syntax for creating a string variable. |
List the types of machine learning techniques. |
Write a query that returns the Details of each department and a count of the number of Students in each: |
Write the syntax for accessing a module written in Python from C. |
Write the Components of relational evaluation techniques. |
List the paradigms of ensemble methods. |
Write a program to convert uppercase letters to lowercase. |
Explain how to capture the correlation between a continuous and a categorical variable. |
Explain what regularization is and why it is useful. |
Explain the use of decorators? |
Explain supervised and unsupervised machine learning |
Explain Confusion Matrix |
Explain Normal Distribution |
Explain Covariance and Correlation in Data Science |
Explain Linear Regression |
Explain Collaborative Filtering |
Explain Python Dictionary |
Explain Auto Encoder |
Explain Rmarkdown |
Explain K-Means clustering? |
Explain why data cleaning is important in analysis. |
Explain the split(), sub(), and subn() methods of the “re” module. |
Explain the use of the // (floor division) operator in Python. |
Explain sequence learning. |
Why should you use NumPy arrays instead of nested Python lists? |
Why is an import statement required in Python? |
Why is data important in data analysis? |
Why data analysis is an important part of the analysis? |
Why is a useful metric? |
Why is a Python tuple preferred over a Python list? |
Which metric acts like accuracy in classification problem statement? |
Which Python library is used for data visualization? |
Which function is used to get descriptive statistics of a dataframe? |
Which function can be used to filter a DataFrame? |
Which language is suitable for text analysis? R or Python? |
Which tool should you use to find the bugs? |
Which Python library is used for machine learning? |
Which symbol is used to add a comment in R language? |
How and by what methods can data visualizations be used effectively? |
How to understand the problems faced during data analysis? |
How to choose the right chart in case of creating a viz? |
How can I achieve accuracy in the first model that I built? |
How do I enhance a SAS analyst? |
How is the F1 score used? |
How can you randomize the items of a list in place in Python? |
How to get indices of N maximum values in a NumPy array? |
How do you make 3D plots/visualizations using NumPy/SciPy? |
How to access a specific script inside a module? |
How to create a series with letters as index? |
How to convert n number of series to a dataframe? |
How to select a section of a dataframe? |
How are exceptions handled in Python? |
How do you differentiate between KNN and K-means clustering? |
Why is data cleaning an important part of the process? |
How do Data Scientists use statistics? |
How is machine learning used in real-world scenarios? |
How does data modeling differ from database design? |
How to sort items of the list in Python? |
How do you check whether a pandas DataFrame is empty? |
How to Assign Code to the List? |
How to read an Excel file without a file file in the Byndah? |
How often should you update an algorithm? |
How will you define supervised and unsupervised learning? |
How will you evaluate your regression model based on R2, Adjusted R2 and tolerance? |
How will you define your number of clusters in the K-Means clustering algorithm? |
How is kNN different from K-Means clustering? |
How is gradient descent helpful in ML? |
How would you create an empty NumPy array? |
How would you make a Python script executable on Unix? |
How can we create an empty NumPy array? |
How will you reverse a list? |
How will you remove last object from a list? |
How do you find the best approximate solution to the knapsack problem in a given time, and with which algorithm? |
Where to seek help in case of discrepancies in Tableau? |
Where is the Naive Bayes algorithm mostly used for classification? |
Who is a Data Scientist? |
Difference between supervised and unsupervised machine learning? |
Difference between Machine learning and Data Mining? |
Difference between an Array and a Linked list? |
Difference between “long” and “wide” format data? |
Difference between univariate, bivariate and multivariate analysis? |
Difference between Supervised and unsupervised? |
In R, how will you load a .csv file? |
In R Language, provide the usage of Next statement |
Advantages of Tableau Prep? |
Write an algorithm for sorting a numeric dataset in Python. |
Are the aliases used for a module fixed/static ? |
Can Random forest be used for classification and regression? |
Can the values in a tuple be replaced? |
Can you briefly describe the scientific method? |
Can you cite some examples where a false positive is more important than a false negative? |
Can you cite some examples where a false negative is more important than a false positive? |
Can you cite some examples where false positives and false negatives are equally important? |
Can you explain the difference between a validation set and a test set? |
Can you write the formula to calculate R-squared? |
Can you provide sample code for creating a data frame in order to perform slicing in pandas? |
Can you explain a few things about ShinyR? |
Can you compare R and Python as used in data science? |
Can you write code in R? |
Cite an example where false negatives and false positives have equal importance. |
Compare SAS, R, and Python programming? |
Data Science Career |
Data Science job areas |
What is BY-Group processing? |
Describe univariate, bivariate and multivariate analysis? |
Describe the feature selection approaches that are used to pick the correct variables |
Describe Batch, and Epoch in Deep Learning |
Describe LSTM |
Differences between overfitting and underfitting |
Differentiate between univariate, bivariate and then multivariate analysis. |
Differentiate between univariate, bivariate and multivariate analysis? |
Do you know any SAS functions and Call Routines? |
Can you explain the term Botnet? |
Do you know the various components of graphics grammar in R? |
Explain the Naïve Bayes algorithm. |
For a categorical variable, what is the process to check frequency distribution? |
Give examples of supervised and unsupervised ML algorithms. |
Give me two important tasks in the pants? |
Give me the steps for an analytics project |
Give an example of optimizing Python code |
Give us a pictorial representation of the Decision Tree algorithm in Data Science |
Give an example of unzipping. |
If you are given employees’ first and last names, which Python data type would you use to store them? |
Import of a flat file / CSV in pandas |
Is multiprocessing possible in python? |
Is it possible to merge two data frames in R? If yes, how is that done? |
List out the different classification algorithms |
List out the Kernel functions available in SVM |
List out a few functions that are available in the dplyr package |
List out the Supervised Learning Functions. |
Given a list of tweets, find the 10 most used hashtags. |
Mention the characteristics of symmetric data distribution? |
Mention a few important Python skills for data analytics |
Mention the functions that are used to copy objects in Python |
Mention Any Five Algorithms of Machine Learning. |
Mention the different types of sequence learning processes. |
Now companies are heavily investing their money and time to make the dashboards. Why? |
Write a one-liner that counts the number of capital letters in a file. |
Provide the Life cycle of Data Science |
Provide the technical concepts handled in Supervised, Unsupervised and Reinforcement Learning |
Provide the different types of Biases that occur during Sampling and give a one-line definition for each type. |
Provide the various layers of CNN |
Provide the Machine Learning libraries along with its benefits |
Provide the basic steps to create a new R6 class |
Provide an example for False Positive in Data Science |
Provide the various Deep Learning frameworks |
Scope and Applications of Statistics |
The Difference Between Data Modeling and Database Design? |
The performance of the K modular system? |
What’s the difference between a Regression and a Classification problem? |
You should find that data is stored in HDFS format and how the data is structured. Which command should you use to identify the names of HDFS keys? |
You are given a dataset and you have built a decision tree model on top of it. You got an accuracy of 98%. Why shouldn’t you be happy with your model performance? |
01 January 2021
#Data_Science