Dr. Arun Kumar Pandey

About

Passionate data enthusiast with a strong technical background. Committed to using data-driven approaches for innovative problem-solving and decision-making.

Data Consultant & Data Engineer

Here are some details about me:

  • Birthday: July 12
  • Academic Website: Click here
  • Phone: +49-XXXX-XXX-XXXX
  • Degree: Ph.D. in Astrophysics (Cosmology)
  • Email: arunp77@gmail.com / arun-kumar.pandey@soprasteria.com
  • Current City: Mainz, Germany

Resume

Summary

Profile

Astrophysicist turned 'Data Consultant | Data Engineer' with over 5 years of postdoctoral research experience at leading international organizations. Currently working as a Data Engineer, specializing in advanced Python, web scraping, machine learning, deep learning, and data engineering. Skilled in building ETL pipelines, streaming data, and creating APIs. Proficient in a wide range of technologies including SQL, MongoDB, Elasticsearch, AWS, Snowflake, Apache NiFi, Docker, and Kubernetes. Passionate about using data analytics and engineering to deliver insightful, impactful solutions.

  • Mainz, Rheinland-Pfalz, Germany
  • +49-XXXX-XXX-XXXX
  • arunp77@gmail.com

Diploma Courses

Bootcamp: Data Engineering

Dec 2023 - May 2024

University of Paris I: Panthéon-Sorbonne (Data Scientist, DST Germany GmbH, Berliner Straße 56, 77694 Kehl)

Learning objectives: Completed a comprehensive Data Engineering Bootcamp, mastering essential technologies and tools.

  • Create, customize, and manage ETLs to execute data projects
  • Databases, data warehousing, data lake
  • Python for advanced databases (SQL, NoSQL), Big Data (Hadoop, Spark), Git, GitHub/GitLab, CI/CD (Jenkins, GitHub Actions), APIs (Flask, FastAPI), Airflow, Docker containerization, Kubernetes, web scraping, project management

Education

Ph.D. in Astrophysics (Cosmology)

2017

IIT-Gandhinagar, India

Master of Science (M.Sc.) in Physics

D.D.U. Gorakhpur University, Gorakhpur, India

Focused on electronics.

Bachelor of Science (B.Sc.) in Physics, Mathematics, Electronics

D.D.U. Gorakhpur University, Gorakhpur, India

Focused on electronics.

Professional Experience

Data Consultant | Data Engineer

January 2023 - Present (Full-time)

Mainz, Rheinland-Pfalz, Germany

  • Created and deployed a scalable API using FastAPI, Elasticsearch, and Docker for analyzing diverse job market data sources. Facilitated seamless data extraction, transformation, loading, and execution of predefined analytical tasks.
  • Developed expertise in remote sensing through an extensive training program provided by EUMETSAT, gaining skills in data access and retrieval, fundamentals of remote sensing, and Python data analysis of multidimensional meteorological datasets.
  • Developed and deployed a dynamic web application using Python Flask, showcasing proficiency in Flask fundamentals and template utilization.
  • Engineered data analytics solutions for Spotify's global song dataset, ensuring secure data handling and achieving streamlined project management.
  • Architected end-to-end ETL pipelines leveraging Docker, Jupyter Notebook, Apache NiFi, and Snowflake, facilitating real-time data flow to AWS S3.
  • Built data warehouses using Snowflake and created data pipelines using Apache Airflow on AWS.
  • Applied dimensional modeling techniques, carried out Linux system operations, and conducted risk analysis through Monte Carlo simulations.
  • Managed an M&A project during a JPMorgan job simulation, identifying ideal targets, building a DCF model, and creating a client-facing document for auction support.
  • Led data visualization projects for TATA Industries during job simulations, translating complex data into tailored visuals that aligned with business needs and effectively communicated insights to diverse audiences.
  • Conducted extensive risk analysis using Monte Carlo simulations for market and portfolio risk management, quantifying potential losses under various scenarios.
  • Published blogs on science, finance, data science, and analytics, showcasing expertise and thought leadership.
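The Monte Carlo risk analysis mentioned above can be illustrated with a minimal sketch. It assumes normally distributed one-day returns and estimates Value at Risk (VaR) from simulated profit and loss. The function name `simulate_var`, its parameters, and the example figures are hypothetical and are not taken from the actual projects.

```python
import random


def simulate_var(mean_return, volatility, portfolio_value,
                 confidence=0.95, n_trials=10_000, seed=42):
    """Estimate one-period Value at Risk by Monte Carlo sampling.

    Assumption for illustration: returns are normally distributed
    with the given mean and volatility.
    """
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    # Simulated profit and loss for each trial
    pnl = [portfolio_value * rng.gauss(mean_return, volatility)
           for _ in range(n_trials)]
    pnl.sort()
    # P&L at the (1 - confidence) quantile; VaR is reported as a positive loss
    index = int((1 - confidence) * n_trials)
    return -pnl[index]


# Hypothetical portfolio: 1,000,000 with 2% daily volatility
var_95 = simulate_var(mean_return=0.0005, volatility=0.02,
                      portfolio_value=1_000_000)
print(f"Estimated 1-day 95% VaR: {var_95:,.0f}")
```

A real analysis would typically replace the normal assumption with historical or scenario-based return distributions and would cover several correlated assets.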

Visiting Scientist

June 2022 - July 2022 (Stipendium)

Max-Planck-Institut für Astrophysik, Garching bei München, Germany

  • Used Git for version control and collaborative project management.
  • Led Python-based research projects analyzing large datasets.
  • Developed Python algorithms to quantify magnetic field effects on CMB distortions, lensing, galaxy counts, the dark-ages 21 cm power spectrum, and matter power spectra.

Postdoctoral Research Scientist

July 2020 - December 2020 (Full-time)

Hangzhou Institute for Advanced Studies, Hangzhou, Zhejiang, China

  • Studied the effects of gravitational waves on electromagnetic fields.
  • Explored gravitational wave generation mechanisms in hot dense neutrino plasmas and their potential explanation for the NANOGrav signal, including constraints on magnetic field strength, which resulted in one publication in a respected international journal.

D.S. Kothari Postdoctoral Research Scientist

June 2019 – June 2020, January 2021 - May 2022 (Full-time)

Department of Physics and Astrophysics, University of Delhi, India

  • Designed models and formulated algorithms: wrote Python code to model the global temperature and ionization history of the universe, and evaluated the impact of exotic energy sources such as dark matter annihilation or decay.
  • Conducted detailed mathematical calculations and solved complex equations on gravitational collapse of hot gaseous plasma in magnetic fields using Python.

Student Teaching Assistant

July 2019 – December 2019

Department of Physics and Astrophysics, University of Delhi, India

  • Worked as a teaching assistant for the M.Sc. Mathematical Physics course.

Postdoctoral Research Fellow

June 2017 – June 2019 (Full-time)

Physical Research Laboratory, Ahmedabad, India

  • Innovative Model Building: Studied primordial field generation in the early universe, contributing to a deeper understanding of cosmic magnetism.
  • Skilled in overseeing High-Performance Computing (HPC) systems for extensive computational work. Expertise in automating and optimizing data-driven operations in intricate networked environments.

Portfolio

Welcome to my portfolio, a hub of knowledge and expertise spanning various domains in data science, programming, and scientific research. In this repository, I have curated a diverse collection of resources designed to help learners, enthusiasts, and fellow professionals with continuous learning and skill development. Explore the following sections to access materials ranging from Python programming and machine learning to real-world data analysis projects and best practices in version control. You'll also find courses related to EUMETSAT, scientific code for gravitational collapse simulations, and documentation resources in LaTeX, MS Word, and more. Whether you're a budding data scientist, a seasoned developer, or a researcher, these resources are here to support your growth. The related code is available in my GitHub repository.

  • All
  • Data analytics
  • Machine-Learning
  • Monte Carlo Simulation
  • Academic codes
  • Data engineering
  • Remote sensing

Skills and Tools

Here are some of the skills I specialize in:

  • Python: (NumPy, Pandas, Seaborn, Matplotlib, Statsmodels, SciPy, Plotly, scikit-learn, xarray, Satpy, GeoPandas)
  • Machine Learning: (Supervised Learning: Linear, Polynomial & Logistic Regression, Decision Trees, K-nearest Neighbors (KNN), Support Vector Machines (SVMs), Random Forest, Naive Bayes, Gradient Descent), (Unsupervised Learning: Principal Component Analysis (PCA)), ARIMA, TensorFlow, Maximum Likelihood Estimation
  • Artificial Intelligence: (Natural Language Processing (NLP))
  • Version Control: (Git, GitHub, GitLab)
  • Database Management: (SQL, BigQuery, MySQL, PostgreSQL, Elasticsearch, MongoDB, Luna Data Modeler)
  • Data Engineering: (Data Warehouse, Data Lake, Snowflake, Apache Airflow, Kafka, FastAPI, Flask, CI/CD pipeline, Dash, Unit Test, ETL/ELT Processes, Atlassian Tools (Jira, Confluence, Trello), Amazon Web Services (AWS), Redshift)
  • Scripting: (Bash, Shell)
  • Web Development: (HTML & CSS)
  • Software and Tools: (VSCode, API Integrations, Virtual Machine, Docker Containerization, Bitbucket, Mathematica)
  • Operating Systems: (Linux Environments, Windows, macOS)
  • Simulation: (Monte Carlo Simulation, OpenMPI)
  • Dashboard Tools: (MS Excel, Power BI, Tableau, Looker Studio)
  • Documentation: (LaTeX, MS Word, Mac Pages)

Other Skills

Here are some additional skills and tools I specialize in:

  • Level-0 to 2 Meteorological Datasets
  • Remote Sensing
  • Time Series Analysis
  • ETL (Extract, Transform, Load) Process
  • Anomaly Investigation
  • Statistical Analysis
  • Data Mining
  • Data Modeling
  • Predictive Analytics
  • Quick Learner
  • Data Pipeline on Airflow
  • Cross-Functional Collaboration
  • Highly Organized
  • Problem-Solving Abilities
  • Communication Skills
  • Detail Oriented
  • Fluent in English
  • Quality Control and Validation

Technical Skills

With my years of work in the research field, I have acquired the following technical skills:

Python
Machine Learning
Data Engineering
NLP
SQL
Visualization
HTML/CSS
Documentation

Languages

English (Fluent)
Hindi (Native)
German (A2)
Bhojpuri (Native)
Chinese (Beginner)

Awards / Fellowship / Recognition

  1. D. S. Kothari Postdoctoral Fellowship (DSKPDF) in Sciences, University Grants Commission (UGC)
    • Year: April 2019
    • Grant number: (BSR) PH/18-19/0070
    • I was one of only 15 candidates selected nationwide for the April 2019 cycle.
    • Subject: Astrophysics
  2. CSIR-Junior Research Fellowship (CSIR-JRF)
    • Year: June 2011
    • Qualified in the National Eligibility Test (NET) for Lectureship (June 2011), conducted by the Council of Scientific and Industrial Research (CSIR) and the University Grants Commission (UGC) under the Ministry of Human Resource Development, Government of India, for a doctoral fellowship.
    • Subject: Physics
    • Rank: All India Rank 33 (CSIR-JRF).
  3. Graduate Aptitude Test in Engineering (GATE)
    • Year: March 2011
    • Qualified in the Graduate Aptitude Test in Engineering (GATE), conducted by the Indian Institutes of Technology (IITs) and the Indian Institute of Science (IISc) on behalf of the National Coordination Board – GATE, Department of Higher Education, Ministry of Education (MoE), Government of India.
    • Subject: Physics
    • Percentile: 98.39
    • Rank: All India Rank 107.
  4. Fellowship for the Doctoral Studies & Research
    • Year: July 2011 – June 2016
    • Physical Research Laboratory Ahmedabad, India, Department of Space, Government of India
    • Subject: Theoretical Physics / Cosmology / Astrophysics
  5. Fellowship for the Doctoral Studies & Research
    • Year: June 2011
    • The Institute for Plasma Research (IPR), Gandhinagar, India, Department of Atomic Energy (DAE), Government of India
    • Research area offered: ITER-India program

Professional Courses & Certification

  • Operational Satellite Oceanography Workshop
  • Nov 2023 – Dec 2023

    • Visualise CoastWatch data and use command-line tools to extract data from CoastWatch products.
    • Access data operationally through the EUMETSAT Data Store APIs and EUMDAC client.
    • Conduct batch processing using SNAP and supporting Jupyter notebooks.
    • Extract and analyse in situ matchups with Sentinel-3 ocean colour data using ThoMaS - a Tool to generate Matchups of OC products with Sentinel-3/OLCI.
    • Customized Jira configurations for project-specific needs, improving team productivity.
    • Work with GOCI-II data.
  • EUMETSAT Data Access Services & European Weather Cloud
  • Sep 2023 – Dec 2023

    • Completed an extensive course on remote sensing provided by EUMETSAT, gaining a strong foundation in data access and retrieval, the fundamentals of remote sensing, and Python data analysis of multidimensional meteorological datasets.
    • Developed skills in using remote sensing data to analyze and monitor environmental phenomena such as climate change, weather patterns, and natural disasters.
    • Learned to apply Python programming language to process and analyze remote sensing data, extracting valuable insights and information.
    • Gained experience in working with a variety of remote sensing datasets, including satellite imagery and ground-based data.
  • Data Warehouse for Data Engineering with Snowflake
  • Sep 2023

    • Mastered fundamentals of Data Warehouses.
    • Gained in-depth knowledge of Dimension Modelling, including E-Commerce Dimension Modelling.
    • Learned about Slowly Changing Dimension techniques.
    • Acquired skills in Extract Transform Load (ETL) processes.
    • Completed a project on Spotify Data Pipeline using Snowflake, AWS, and Python.
    • Implemented Real-Time Data Streaming using AWS, Snowflake.
    • Developed expertise in creating pipelines using Apache Airflow.
  • Cloud Platform, Secure a cloud-based application at Verizon (virtual simulation)
    • Completed a job simulation involving building a hypothetical new VPN product for Verizon’s Cloud Computing team.
    • Used command line Python to test whether Verizon’s VPN met the cloud-native traits, i.e. redundancy, resiliency, and least privilege.
    • Researched approaches to achieve application security and communicated insights in a PowerPoint Presentation.
  • JPMorgan Chase Investment Banking Virtual Experience Program on Forage
  • Sep 2023

    • Identified an ideal M&A target for a client based on an assessment of their strategic and financial criteria.
    • Constructed a DCF model to calculate the valuation of the M&A target and adjusted the model to account for a competitor bid and supply chain interruption.
    • Created a 2-pager for the client containing a company profile and summary of the auction process.
  • Tata Data Visualisation: Empowering Business with Effective Insights
  • Aug 2023

    • Addressed key business questions for leaders through tailored data visualization and interpretation.
    • Selected the most suitable visuals, such as charts and graphs, to communicate complex data effectively.
    • Crafted visuals aligned with business requirements, covering data visualization, dashboard development, and data refinement.
    • Conveyed insights to diverse audiences and explained the significance of the data in various contexts.
  • Machine Learning with Python: Supervised learning
  • Feb 2023 – Mar 2023

    • Regression models
    • Classification model: K-nearest neighbor
  • Data Analysis with Python: Zero to Pandas
  • Oct 2022 – Dec 2022

    • NumPy, Pandas, Seaborn, Matplotlib, Plotly, Statsmodels, SciPy, scikit-learn

Research publications

Here are some of my research publications:

My ORCID iD

  1. Primordial Magnetic field and kinetic theory with Berry curvature, Jitesh R. Bhatt, Arun Kumar Pandey [arXiv:1503.01878 [astro-ph.CO]] Phys.Rev.D. 94, 043536
  2. Primordial Generation of magnetic field, Jitesh R. Bhatt, Arun Kumar Pandey [arXiv:1507.01795 [gr-qc]] Springer Proc.Phys. 174 (2016) 409-413
  3. Effect of background magnetic field on the normal modes of conformal dissipative chiral hydro and a novel mechanism for explaining pulsar kicks Arun Kumar Pandey, Manu George [arXiv:1609.01848 (astro-ph.CO)]
  4. Chiral Battery, scaling laws and magnetic fields, Sampurn Anand, Jitesh R. Bhatt & Arun Kumar Pandey [arXiv:1705.03683 (astro-ph.CO)] JCAP, July 2017
  5. Chiral Plasma Instability and Primordial Gravitational waves, Sampurn Anand, Jitesh R. Bhatt & Arun Kumar Pandey [arXiv:1801.00650 [astro-ph.CO] (2019)] Eur. Phys. J. C (2019) 79: 119.
  6. Baryon-Dark matter interaction in presence of magnetic fields in light of EDGES signal, Jitesh R Bhatt, Pravin Kumar Natwariya, Aleka C. Nayak, Arun Kumar Pandey [arXiv:1905.13486 [astro-ph.CO] (2019)], Eur. Phys. J. C 80, 334 (2020)
  7. Viscosity in cosmic fluids, Jitesh R Bhatt, Pravin Kumar Natwariya, Arun Kumar Pandey, arXiv:1907.03445 [astro-ph.CO] (2019) Eur. Phys. J. C 80 (2020) 8, 767
  8. Magnetic fields in a hot dense neutrino plasma and the Gravitational Waves Arun Kumar Pandey, Pravin Kumar Natwariya, Jitesh R Bhatt, arXiv:1911.05412 [astro-ph.CO] (2020), Phys. Rev. D 101, 023531 (2020)
  9. Implications of baryon-dark matter interaction on IGM temperature and tSZ effect with magnetic field, Arun Kumar Pandey, Sunil Malik, T. R. Seshadri, arXiv:2006.07901 [astro-ph.CO] (2020), Mon.Not.Roy.Astron.Soc. 500 (2020)
  10. Gravitational waves in neutrino plasma and NANOGrav signal, Arun Kumar Pandey arXiv:2011.05821 [astro-ph.CO] (2020) [Eur.Phys.J.C 81 (2021) 5, 399]
  11. Generating Seed magnetic field à la Chiral Biermann battery, Arun Kumar Pandey, Sampurn Anand (Phys. Rev. D. 104, 063508 (2021))
  12. Thermal SZ effect in a magnetized IGM dominated by interacting DM decay/annihilation during dark ages, Arun Kumar Pandey, Sunil Malik (2022) [arXiv:2204.08088]

Contact

You can contact me at my email: arunp77@gmail.com.

Location:

Mainz, Germany

Call:

+49 XXXX-XXX-XXXX
