When you're learning data science, you usually practice with nice, clean, pre-packaged data sets and tidy case studies that lead you step-by-step from data collection to cool insights.
But when real life hits, many data scientists have to work with missing or sketchy information pulled from multiple sources across the organization. Data science that works is a messy, trial-and-error process of creating and testing hypotheses, gathering evidence, and drawing conclusions.
Going Pro in Data Science: What It Takes to Succeed as a Professional Data Scientist, by distinguished CSC engineer Jerry Overton, outlines practices for making good decisions in the complicated real world. These skills are far more useful for practicing data scientists than, say, mastering the details of a machine-learning algorithm.
It's an incredibly practical ebook. And it's free.
Enjoy!

Download the free ebook → http://www.oreilly.com/data/free/going-pro-in-data-science.csp?imm_...

Ben Lorica
Chief Data Scientist, O'Reilly Media
P.S. Jerry Overton is also presenting a half-day tutorial on the topic at Strata + Hadoop World in NY in September, providing in-depth education in data science, big data architecture, and analytics for business. As an O'Reilly customer, you get 30% off the Early Price with code DATA30 if you register by August 12.
