top of page

Predicting Earthquake Patterns with ML

In this project, I decided to breakdown seismic data to unravel the earthquake patterns across different U.S. states. This project included many new and challenging hurdles for me and required me to conduct extensive research and apply complex analytical techniques.


Explore the code on my GitHub: https://lnkd.in/d6BXSDB2.
Here is the dataset I used:https://lnkd.in/d7aFq76m

WhatsApp Image 2023-09-19 at 00.14.17.jpg

This was the most interesting part of my project. To produce this scatter plot I was required to implement my Python skills to break down my data set and accordingly calculate the mean, max, and min of the magnitudes for each specific state. This was a very challenging process for me and required me to research on the internet on how to achieve the following result.

​

As for the data itself, Oregon emerged as the state with the highest mean earthquake magnitude, registering an average of 4.2 on the Richter scale.

Breakdown by states

WhatsApp Image 2023-09-19 at 00.14.39.jpg

The following pie chart breaks down the frequency of 'Earthquake-related deaths' by state. From the data, it can be said that more than half of all deaths in the US occurred in California. This is a surprising finding as it does not have the most destructive earthquakes as seen in the scatterplot above. Earthquakes in California averaged to only 1.3 on the Richter scale. Such a count of deaths could also be explained by the general frequency of earthquakes occurring in California, this variable is measured in the bar lot below. 

WhatsApp Image 2023-09-19 at 00.14.52.jpg

The bar plot above illustrates the frequency of earthquakes occurring in the U.S. states. From this visualization we can see California comes out to the top with having the highest frequency, nearly 650 earthquakes, making it the state with the most seismic activity. Followed by Alaska, with approximately 100 earthquakes.

​

This visualization provides a clear comparison of earthquake occurrences among different states. The stark contrast between California and Alaska indicates that California experiences significantly more earthquakes than most other states, making it a region of high seismic activity.

 

This outlier can easily be explained by the very fact that California is located on multiple tectonic plates, making it one of the most seismically active regions in the United States. The state is mainly situated on the Pacific Plate and the North American Plate. The boundary between these plates is known as the San Andreas Fault, which is one of the most well-known fault lines in the world. This can also be seen on the map below:

image.png

Machine Learning

WhatsApp Image 2023-09-19 at 00.21.14.jpg

As for the ML component in my project, I had decided to create an interface allowing users to input any value that would be considered the magnitude. Subsequently, my model would then determine and provide the name of the state where earthquakes of that magnitude were most likely to have occurred.

​

I was genuinely thrilled to incorporate machine learning into my project because it felt like a form of 'wizardry' to me. It was fascinating to witness how my code could 'reason' and process my datasets autonomously, ultimately delivering results without requiring my constant intervention.

Conclusion and Evaluation

In conclusion, given this was my first python project in which I analyzed and visualized real-life data. I would like to say I am fairly satisfied with the results. I was challenged in my many different ways and lead me to learn new concepts and theories in this subjectThis project allowed me to not only understand the art of data analysis and visualization but also developed a keen eye for identifying meaningful patterns within intricate datasets. 

​

If I were to do this project again, I would like to research ways to include a much larger database. This would be really beneficial for me as my ML would become more accurate and provide more reliable outputs.

bottom of page