CST 383 - Intro to Data Science | Week 3

Learning log 3:

This week I learned more about how to describe and visualize different types of variables. Much of the week focused on continuous variables and on using density plots, histograms, and box plots to understand the shape of the data. I also learned that the choice of bins in a histogram can change how the distribution looks, so a graph is not an automatic answer. It depends on choices that have to make sense for the data.
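A small sketch of the bin-choice idea, using only the Python standard library. The data values and the `bin_counts` helper are made up for illustration and are not from the course materials. The same 14 values look flat with 2 bins and clearly two-peaked with 6 bins.

```python
def bin_counts(values, n_bins, lo, hi):
    """Count how many values fall into each of n_bins equal-width bins on [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Values equal to hi go into the last bin instead of overflowing.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical sample with two clusters, one near 3 and one near 9.
data = [0.5, 2.5, 3.0, 3.0, 3.5, 3.5, 5.5,
        6.5, 8.5, 9.0, 9.0, 9.5, 9.5, 11.5]

two_bins = bin_counts(data, 2, 0, 12)
six_bins = bin_counts(data, 6, 0, 12)

print(two_bins)  # [7, 7] -> looks flat
print(six_bins)  # [1, 5, 1, 1, 5, 1] -> two peaks show up
```

With plotting libraries the same effect shows up when the number of bins passed to the histogram function changes, so it seems worth trying a few values before trusting one picture.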

The famous distributions topic helped me see why distributions matter in data science. The normal distribution is starting to make more sense, especially the ideas of mean and standard deviation and how they describe the spread of values. I still find probability density a little confusing because the height of the curve on the y-axis is not a probability by itself. I understand that a probability comes from the area under the curve over a range of values, but I need more practice with that idea.
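A minimal sketch of the density-versus-probability point, using `statistics.NormalDist` from the Python standard library. The mean of 0 and standard deviation of 0.1 are assumptions chosen only so that the curve is tall and narrow. The height of the curve at the mean is greater than 1, so it cannot be a probability, while the area between two points always stays between 0 and 1.

```python
from statistics import NormalDist

# Assumed parameters for illustration: a narrow normal distribution.
dist = NormalDist(mu=0.0, sigma=0.1)

# Height of the density curve at the mean. This is about 3.99, which is
# larger than 1, so a density value is not a probability.
height_at_mean = dist.pdf(0.0)

# Probability of landing within one standard deviation of the mean,
# computed as the area under the curve between -0.1 and 0.1.
prob_within_one_sd = dist.cdf(0.1) - dist.cdf(-0.1)

print(round(height_at_mean, 2))      # about 3.99
print(round(prob_within_one_sd, 3))  # about 0.683
```

The second number is the familiar "about 68% within one standard deviation" rule, which holds for any normal distribution regardless of how tall the curve is.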

For two continuous variables, correlation and visualization were useful because they show how two things can move together. I understand that correlation can show a relationship, but it does not prove that one thing causes another. I also liked connecting this to scatterplots because the relationship is easier to see visually.
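A short sketch of how a correlation coefficient could be computed by hand, assuming the Pearson correlation is the one meant here. The `pearson_r` helper and the three tiny datasets are hypothetical and exist only for illustration. The last dataset has an obvious curved pattern but a correlation of 0, which is one reason a scatterplot is a useful companion to the number.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unscaled covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]     # rises with x
y_down = [10, 8, 6, 4, 2]   # falls as x rises
y_curve = [4, 1, 0, 1, 4]   # U-shaped: related to x, but not linearly

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
r_curve = pearson_r(x, y_curve)

print(round(r_up, 3), round(r_down, 3), round(r_curve, 3))
```

The first two values come out very close to 1 and -1, and the third is 0 even though `y_curve` is completely determined by `x`.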

The discrete variable topics helped me compare continuous and discrete data. I am starting to understand that different kinds of variables need different kinds of summaries and plots. One question I still have is how to quickly decide which visualization is best for a new dataset. This week helped me see that statistics and visualization work together and that good graphs can make data much easier to understand.
