PipeCandy is a one-of-a-kind, data-science-driven market intelligence platform that tracks the global eCommerce landscape. Our insights are used by well-known global brands and startups. We are venture-funded by investors based in India, the US, and Singapore. We are building a complex data product that aims to revolutionize industry intelligence by applying sophisticated machine learning and AI algorithms to millions of data points.
We are looking for a Data Scientist to ensure that the quality of this product always stays world-class. If you love working with data, have an eye for detail, and hold yourself to a high standard of quality, then we’d love to hear from you.
This is a senior position where the analyst will work under the general direction of the Chief Data Scientist and senior staff in the Data Management team. The primary responsibility is to treat data as an asset and become the expert source for data standards and policy-related questions.
- Manage the creation, deployment, and maturity of data governance processes and technology including master data, metadata, and data quality initiatives
- Identify opportunities to ensure transparent, high-quality data across sources and platforms
- Review, clean, and add business records and formatting rules to every taxonomy/hierarchy in the product database in support of long-term data governance
- Develop processes and tools for data cleansing, de-duplicating, and other data preparation, standardization, and transformation
- Collaborate with various teams to standardize data and ensure adherence to data ingestion and governance standards
- Conduct root cause analysis and propose improvements
- Leverage subject matter expertise to ensure data products are understood by the business users
This position requires proficient analytical and programming skills, experience defining requirements and developing and/or maintaining computer applications/systems, and the ability to meet business needs within deadlines.
- Develop analytical, data mining, and machine learning models using Python, R, and other tools
- Gather, evaluate, and document requirements, and build algorithms (statistical, data mining, or machine learning) based on the requirements and specifications provided
- Work with data to conceptualize and improvise analytical solutions to problems
- Deploy analytical algorithms within a larger business application
- Visualize data and the results of data analysis and analytical models
- Create model documentation as per client/regulatory standards
- 1+ years of total relevant experience
- Degree in a quantitative field (Math, Statistics, Economics, Physics, Engineering, and/or MBA)
- Ability to work with business and technology teams to build and deploy an analytical solution as per needs
- Ability to multi-task, solve problems and think strategically
- Strong communication and collaboration skills
- Experience with statistical analysis using R and Python. Experience with Spark and ML is a plus
- Good experience in data discovery, exploration and algorithm development
- Experience with working on large data sets and developing scalable algorithms
- Hands-on experience with machine learning and data mining algorithms such as decision trees, classifiers, text mining/NLP, clustering, and regression
- Experience with SAS, SPSS, or programming languages such as Java is a plus
- Knowledge of Hadoop and other distributed computing platforms
- Broad knowledge of data mining, NLP algorithms, machine learning algorithms, and other techniques and technologies
- Strong analytical and problem-solving skills
- Excellent presentation and communication skills
- Flat organization structure with an opportunity to work very closely with the founders
- Access to learning and training sessions outside of your immediate line of work
- Access to a group Kindle account with the latest titles
- Stocked pantry, of course