Tiger Analytics:
1st round:
Online test on:
Python coding😀
SQL coding
2nd round:
1. You have 30 records with 10 features and build a model with 10 trees (n_estimators). The data is split 80/20 into train and test sets. After one epoch, training accuracy is 80% but test accuracy is 60%. What would you do to handle this?
2. In share trading, a buyer buys shares and sells them on a future date. Given the stock prices for n days, the trader may make at most k transactions, where a new transaction can start only after the previous one is complete. Find the maximum profit the trader could have made.
3. You have 25 cards in 5 different colors. What is the probability of picking 2 cards of different colors?
4. The probability that at least 1 plane leaves in a 5-minute window is 15%. What is the probability that at least 1 plane leaves in a 30-minute window?
5. What is the probability that a point in a circle lies closer to the circumference than to the center?
6. There is one categorical feature with 3 categories. How do you deal with it? (Follow-up questions after I mentioned dummies:) How do you get the dummies? How are the metrics affected?
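For question 2 above (at most k transactions), one common approach is dynamic programming over the number of completed transactions. The sketch below is only one possible solution, not necessarily what the interviewer expected; the function name `max_profit` and the variable names are my own, and it assumes one share is traded per transaction.

```python
from typing import List


def max_profit(prices: List[int], k: int) -> int:
    """Maximum profit with at most k buy-then-sell transactions (sketch)."""
    n = len(prices)
    if n < 2 or k <= 0:
        return 0

    # If k >= n // 2 the limit never binds, so collect every upward move.
    if k >= n // 2:
        return sum(max(prices[i] - prices[i - 1], 0) for i in range(1, n))

    # cost[t]   = lowest effective buy price for transaction t, after
    #             reinvesting the profit of the first t - 1 transactions
    # profit[t] = best profit seen so far using at most t transactions
    cost = [float("inf")] * (k + 1)
    profit = [0] * (k + 1)
    for price in prices:
        for t in range(1, k + 1):
            cost[t] = min(cost[t], price - profit[t - 1])
            profit[t] = max(profit[t], price - cost[t])
    return profit[k]


print(max_profit([3, 2, 6, 5, 0, 3], 2))  # buy at 2, sell at 6; buy at 0, sell at 3
```

This runs in O(n * k) time and O(k) extra space.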
3rd round (DS final round):
1. Find the largest cumulative sum in a list containing negative and positive integers.
2. How do you draw an AUC graph?
3. Case study: You work at Hyundai (a car manufacturer). The company is offering a 20% discount (in the form of low-interest loan rates, etc.). How will this impact sales? (Give your approach and an approximate sales figure, assuming some pre-discount sales figure.)
4. Resume discussion
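For question 1 above, I am assuming "largest cumulative sum" means the maximum sum of a contiguous sub-list. Under that assumption, Kadane's algorithm is the usual answer. A minimal sketch (the function name is my own):

```python
from typing import List


def max_subarray_sum(nums: List[int]) -> int:
    """Largest sum of any non-empty contiguous sub-list (Kadane's algorithm)."""
    if not nums:
        raise ValueError("nums must be non-empty")
    best = current = nums[0]
    for x in nums[1:]:
        # Either extend the running sub-list or start a new one at x.
        current = max(x, current + x)
        best = max(best, current)
    return best


print(max_subarray_sum([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # sub-list [4, -1, 2, 1]
```

It makes a single pass over the list, so it is O(n) time with O(1) extra space. If the interviewer instead meant the largest running (prefix) total, the same loop works with `current = current + x`.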
4th round (MLOps final round):
1. What is the difference between multiprocessing and threading?
2. What is multi-threading?
3. Questions on Spark.
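For questions 1 and 2 above, a small threading sketch can help anchor the answer. Threads live inside one process and share its memory, which is why the shared counter below needs a lock; separate processes each get their own memory and have to exchange data explicitly. In standard CPython builds the GIL keeps threads from running Python bytecode in parallel, so threads mostly help with I/O-bound work and processes with CPU-bound work. All function names here are my own, and `time.sleep` only stands in for real I/O.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List


def fake_io_task(task_id: int) -> int:
    """Stand-in for an I/O-bound call (network, disk); sleeping releases the GIL."""
    time.sleep(0.05)
    return task_id * task_id


def run_with_threads(n_tasks: int, n_workers: int = 4) -> List[int]:
    """Run the tasks on a pool of threads; results come back in input order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(fake_io_task, range(n_tasks)))


def count_with_lock(n_threads: int, increments: int) -> int:
    """All threads update one shared counter, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal counter
        for _ in range(increments):
            with lock:  # without the lock, updates could be lost
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print(run_with_threads(8))
print(count_with_lock(4, 1000))
```

For CPU-bound work, the same `map` pattern can be used with `concurrent.futures.ProcessPoolExecutor`, at the cost of pickling arguments and results between processes.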
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Alstom:
1st round:
1. Plot the graph of y = sin(x); details are given. (Live coding round in a Jupyter Notebook; all dependent libraries are provided beforehand for any ML-related coding.)
2. Given stock market data, use linear regression to predict on the test data. Also use PCA to reduce the data to 2 features (out of 5).
3. (RF algorithm understanding) Suppose there are 25 trees and each tree has an error of 0.25. What will the total error be?
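For question 3 above, one way to reason is the textbook majority-vote argument. It assumes binary classification, trees whose errors are independent, and a forest that predicts by majority vote; none of these were stated in the question, and real trees in a random forest are correlated, so treat the result as an idealized lower bound. Under those assumptions the forest is wrong only when at least 13 of the 25 trees are wrong, which is a binomial tail. The function name is my own.

```python
from math import comb


def majority_vote_error(n_trees: int, tree_error: float) -> float:
    """P(more than half of n_trees independent classifiers are wrong).

    Assumes an odd n_trees, binary labels, and independent errors.
    """
    min_wrong = n_trees // 2 + 1  # smallest number of wrong votes that wins
    return sum(
        comb(n_trees, k) * tree_error**k * (1 - tree_error) ** (n_trees - k)
        for k in range(min_wrong, n_trees + 1)
    )


print(majority_vote_error(25, 0.25))
```

With 25 trees at 0.25 error each, the ensemble error under these assumptions comes out well below 1%, far lower than any single tree.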
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Legato:
1st round:
1. How do you handle an imbalanced dataset when the data is huge?
2. Can we handle imbalance via XGBoost (without using any external technique like SMOTE)?
3. Difference between a list comprehension and a normal loop.
4. Difference between a lambda and a normal function.
5. Determine the output. (I was given some .py code that used list append and had to predict the output.)
6. Determine the output. (Checking class and object concepts.)
7. Difference between immutability and mutability.
8. Some direct questions on statistics.
9. Basic knowledge of Hadoop and PySpark.
10. You have 10,000 records and 10 features, and the classes are linearly separable (classification problem). Which is better to use: logistic regression or linear SVC?
11. Resume discussion
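The Python questions above (3, 4, 5 and 7) are easiest to answer with a few lines of code. The snippet below is my own illustration, not the code shown in the interview; the mutable-default-argument case is only a guess at the kind of "list append" output question that tends to come up.

```python
from typing import List

# Q3: a list comprehension and an explicit loop build the same list.
squares_loop: List[int] = []
for n in range(5):
    if n % 2 == 0:
        squares_loop.append(n * n)

squares_comp = [n * n for n in range(5) if n % 2 == 0]


# Q4: a lambda is an anonymous, single-expression function;
# def gives a named function that can hold statements and a docstring.
def add_def(a, b):
    return a + b


add_lambda = lambda a, b: a + b


# Q7: lists are mutable, so two names can observe the same change.
original = [1, 2, 3]
alias = original      # no copy, both names point to one list
alias.append(4)       # original is changed as well

point = (1, 2)        # tuples are immutable; point[0] = 9 raises TypeError


# Q5 (hypothetical example): a mutable default is created once and reused.
def append_shared(item, bucket=[]):
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    if bucket is None:
        bucket = []   # new list on every call
    bucket.append(item)
    return bucket


print(squares_loop == squares_comp, add_lambda.__name__, original)
```

Calling `append_shared(1)` and then `append_shared(2)` returns a list holding both items, because the default list persists between calls, while `append_fresh` starts from an empty list each time.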
2nd Round:
Online test: MCQs on ML, NLP, DL, CV.
Python coding problem.
SQL coding problem.
3rd Round:
1. Plotting an AUC curve.
2. What is regularization?
3. You have a theft dataset containing sensitive attributes such as race, age, sex, and caste, and you need to predict whether a person is a thief. Your model predicts "thief" for Black individuals far more often than for white individuals. Why do you think the model behaves this way? (This question is ultimately probing pre- and post-training bias analysis, and also SHAP values.)
4. Basic knowledge of Explainable AI.
5. There is a dataset of 100k records. You have derived a new feature that is useful, but only 20k records have a value in that column. How do you handle this?
6. What are the different statistical tests available? Explain a few.
7. You need to predict whether a patient needs a knee operation. Which error is worse in this case: Type I or Type II? Also explain both.
8. Basic questions on different visualization tools.
9. What is hyperparameter tuning? What are the ways it can be done?
10. Resume discussion.
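For the AUC plotting question above, it helps to show that the curve being plotted is the ROC curve (true positive rate against false positive rate as the decision threshold is lowered) and that AUC is the area under it. The sketch below computes both in plain Python; the function names and the toy labels and scores are my own, and in practice a library routine would normally be used instead.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def roc_points(y_true: Sequence[int], scores: Sequence[float]) -> List[Point]:
    """ROC points obtained by sweeping the threshold from high to low."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    points: List[Point] = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last item in a group of tied scores.
        last_of_tie = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if last_of_tie:
            points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: Sequence[Point]) -> float:
    """Area under the curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


pts = roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(pts, auc(pts))
```

To draw the graph, the FPR values go on the x-axis and the TPR values on the y-axis (for example with matplotlib's `plt.plot`), usually alongside the diagonal from (0, 0) to (1, 1) that represents random guessing.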
4th round:
Managerial round:
1. Resume discussion only (very little technical content).
2. Knowledge of the domain of the company you are interviewing with.
3. I also asked the interviewer some questions of my own.
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------