When you're learning data science, you usually practice with nice, clean, pre-packaged data sets and tidy case studies that lead you step-by-step from data collection to cool insights.
But when real life hits, many data scientists have to work with missing or sketchy information pulled from multiple sources across the organization. Data science that works is a messy, trial-and-error process of creating and testing hypotheses, gathering evidence, and drawing conclusions.
Going Pro in Data Science: What It Takes to Succeed as a Professional Data Scientist, by distinguished CSC engineer Jerry Overton, outlines practices for making good decisions in the complicated real world. These skills are far more useful for practicing data scientists than, say, mastering the details of a machine-learning algorithm.
It's an incredibly practical ebook. And it's free.
Enjoy!

Download the free ebook → http://www.oreilly.com/data/free/going-pro-in-data-science.csp?imm_...

Ben Lorica
Chief Data Scientist, O'Reilly Media
P.S. Jerry Overton is also presenting a half-day tutorial on the topic at Strata + Hadoop World in NY in September, providing in-depth education in data science, big data architecture, and analytics for business. As an O'Reilly customer, you get 30% off the Early Price with code DATA30 when you register by August 12.
