WEBVTT Kind: captions Language: en-US 00:00:00.480 --> 00:00:04.640 Hi, everyone. I’m Clara Yoon from the USGS Pasadena office. 00:00:05.200 --> 00:00:09.440 And I’ll be telling you about the potential for machine learning to 00:00:09.440 --> 00:00:13.280 really improve and modernize the way earthquake monitoring is done. 00:00:14.000 --> 00:00:16.080 And I would like to thank my colleagues 00:00:16.080 --> 00:00:17.920 from the Caltech Seismo Lab. 00:00:22.560 --> 00:00:26.080 At a very high level, this is how earthquake monitoring is done 00:00:26.720 --> 00:00:31.280 in a regional seismic network. We start with continuous seismic 00:00:31.280 --> 00:00:35.600 data recorded at a few hundred stations all across the region. 00:00:36.240 --> 00:00:41.760 And that data is fed in real time into software that’s running on 00:00:41.760 --> 00:00:46.160 a bunch of computer servers. And the output of the software 00:00:46.160 --> 00:00:50.080 is an earthquake catalog, which tells you the location, 00:00:50.080 --> 00:00:54.160 time, and magnitude of all the earthquakes in this region. 00:00:54.800 --> 00:00:57.840 Now, how does this software actually work? 00:00:58.640 --> 00:01:01.520 Well, the software that we use for earthquake monitoring 00:01:01.520 --> 00:01:06.160 is called AQMS/Earthworm. It provides real-time automatic 00:01:06.160 --> 00:01:10.080 earthquake information. And it’s over 20 years old. 00:01:10.080 --> 00:01:15.600 It’s been thoroughly tested and well-tuned to run during this entire time. 00:01:16.320 --> 00:01:21.280 And how it works is, it starts with the continuous seismic data as input. 00:01:21.920 --> 00:01:26.880 It detects earthquake signals and then picks P and S phase arrivals 00:01:26.880 --> 00:01:30.880 and determines first-motion polarity. And so these steps are done 00:01:30.880 --> 00:01:35.920 at every station.
And the next step is to take many phases from 00:01:35.920 --> 00:01:40.160 different stations and associate them into an event. 00:01:40.160 --> 00:01:44.320 Then we locate that event, compute its magnitude, 00:01:44.320 --> 00:01:49.120 determine its focal mechanism, moment tensor for larger earthquakes – 00:01:49.120 --> 00:01:53.840 and all this information goes into the earthquake catalog. 00:01:55.440 --> 00:01:59.840 Now, the problem with this [inaudible] software is that these automatic 00:01:59.840 --> 00:02:04.800 algorithms, especially for detection and phase picking, they’re not perfect. 00:02:04.800 --> 00:02:06.640 Sometimes they can make mistakes. 00:02:07.440 --> 00:02:14.560 And so our standard practice in the seismic networks is to have human analysts visually review 00:02:14.560 --> 00:02:18.640 all these earthquake solutions. 00:02:18.640 --> 00:02:21.840 And phases are manually re-picked if necessary. 00:02:22.640 --> 00:02:25.040 And so here I’m showing you an example. 00:02:25.040 --> 00:02:29.840 The green bars here on the left show you where the automatic algorithm 00:02:29.840 --> 00:02:34.960 thinks the P and S phases should be. The P phases look pretty good, but the 00:02:34.960 --> 00:02:39.760 S phases should definitely be earlier. And so a human analyst really 00:02:39.760 --> 00:02:42.880 needs to go through and manually re-pick all these phases 00:02:42.880 --> 00:02:46.640 so that they line up and are correct. 00:02:46.640 --> 00:02:51.920 And so, looking towards the future, we really want to be able to automate this 00:02:51.920 --> 00:02:57.680 entire earthquake monitoring process without too much human intervention. 00:02:58.400 --> 00:03:03.120 And so that’s where artificial intelligence, and more specifically, 00:03:03.120 --> 00:03:08.560 machine learning, could really play a big role in entirely automating this 00:03:08.560 --> 00:03:12.560 earthquake monitoring process.
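As a rough illustration of the association step described above (taking picks from many stations and grouping them into a candidate event), here is a minimal sketch. The `Pick` class, the simple time-window grouping rule, and the station names are illustrative assumptions, not the actual AQMS/Earthworm associator logic, which is considerably more sophisticated.

```python
from dataclasses import dataclass

@dataclass
class Pick:
    station: str
    phase: str   # "P" or "S"
    time: float  # arrival time in seconds

def associate(picks, window=10.0, min_picks=4):
    """Naively group picks that fall within `window` seconds of each
    other into candidate events; discard groups with too few picks."""
    events, current = [], []
    for p in sorted(picks, key=lambda p: p.time):
        if current and p.time - current[0].time > window:
            if len(current) >= min_picks:
                events.append(current)
            current = []
        current.append(p)
    if len(current) >= min_picks:
        events.append(current)
    return events

picks = [Pick("CI.PAS", "P", 12.1), Pick("CI.RPV", "P", 13.0),
         Pick("CI.USC", "P", 12.6), Pick("CI.PAS", "S", 15.4),
         Pick("CI.DLA", "P", 95.2)]
events = associate(picks)
# One candidate event with 4 picks; the lone pick at t = 95.2 s is discarded.
```

A real associator would also check travel-time consistency between stations before declaring an event, rather than just temporal proximity.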
So the definition of machine learning 00:03:12.560 --> 00:03:17.760 is really algorithms that are able to kind of learn on their own by example 00:03:17.760 --> 00:03:20.800 without explicitly being told how to do so. 00:03:21.920 --> 00:03:27.280 And today I’ll talk about a subset of machine learning called deep learning, 00:03:27.280 --> 00:03:33.040 which is where you have these artificial neural networks that take massive data 00:03:33.040 --> 00:03:38.320 sets as input – so these are called training data sets, and they already 00:03:38.320 --> 00:03:41.920 have known labels. So you already know ahead of time what the answer is. 00:03:42.640 --> 00:03:49.680 And these networks will develop a model that can learn from vast amounts of data 00:03:49.680 --> 00:03:54.160 in order to predict information about new data that it’s never seen 00:03:54.160 --> 00:03:59.440 before in the future. And deep learning has had tremendous impacts on many 00:03:59.440 --> 00:04:05.120 different fields, such as image recognition, medical diagnosis, 00:04:05.120 --> 00:04:10.800 speech recognition, self-driving cars, and also in earthquake seismology 00:04:10.800 --> 00:04:13.840 and monitoring, as I will discuss today. 00:04:16.240 --> 00:04:21.120 And so the majority of deep learning research that could be applied to 00:04:21.120 --> 00:04:26.400 earthquake monitoring since 2018 has been done for earthquake 00:04:26.400 --> 00:04:30.000 detection and phase picking. And that’s because this is 00:04:30.000 --> 00:04:34.480 a really tedious task that requires a lot of human review. 00:04:34.480 --> 00:04:38.800 It’s really ripe for automation. And what’s more, we already have 00:04:38.800 --> 00:04:44.240 these massive labeled data sets available for training that are really a 00:04:44.240 --> 00:04:49.360 requirement for these types of models. Because we’ve had human analysts 00:04:49.360 --> 00:04:54.640 over the decades build up millions of examples of P picks and S picks.
00:04:55.280 --> 00:05:00.240 And now GPUs are widely available. They give us a lot of 00:05:00.240 --> 00:05:04.160 computational power. And then we also have open source 00:05:04.160 --> 00:05:08.400 software libraries for deep learning, such as TensorFlow and PyTorch, 00:05:08.400 --> 00:05:11.440 that are well-documented and accessible to non-experts. 00:05:12.160 --> 00:05:14.560 So all these things together have really paved the way 00:05:14.560 --> 00:05:17.520 for deep learning in earthquake monitoring. 00:05:20.240 --> 00:05:26.400 So here I’m showing you one of the first examples of a deep learning model that 00:05:26.400 --> 00:05:29.440 has been developed for earthquake detection and phase picking. 00:05:30.640 --> 00:05:35.280 This is called generalized phase detection from Ross et al. 2018. 00:05:36.320 --> 00:05:42.400 The training data set used to develop this model was really big. 00:05:42.400 --> 00:05:48.160 It’s from southern California. It consisted of 1.5 million examples 00:05:48.160 --> 00:05:52.720 each of P waves, S waves, and noise seismograms. 00:05:53.520 --> 00:05:58.320 So the input to this model is a three-component waveform 00:05:58.320 --> 00:06:01.680 that’s a few seconds long. And the deep learning model 00:06:01.680 --> 00:06:05.280 is able to automatically extract the most 00:06:05.280 --> 00:06:10.960 important features that really distinguish a P wave from an S wave from noise. 00:06:10.960 --> 00:06:14.640 So this is one of the big advantages of using deep learning. 00:06:15.440 --> 00:06:20.000 And then, when all this data goes through the model, 00:06:20.000 --> 00:06:24.960 the outputs are the probabilities that this input was a P wave 00:06:24.960 --> 00:06:30.000 or an S wave or noise.
So here’s an example of a P wave 00:06:30.000 --> 00:06:35.360 detection in red, with a corresponding high probability during that time, 00:06:35.360 --> 00:06:39.280 and an S wave detection in blue with a high probability. 00:06:39.840 --> 00:06:42.320 And usually a threshold is set on this probability, 00:06:42.960 --> 00:06:47.920 and then a phase is picked within that time window. 00:06:47.920 --> 00:06:52.880 There’s been a tremendous amount of research in the past couple of years that 00:06:52.880 --> 00:06:58.400 has developed these kinds of alternative deep learning models for 00:06:58.400 --> 00:07:02.560 earthquake detection and phase picking. Here are some of the studies 00:07:03.120 --> 00:07:08.080 listed here. I highlight three examples of studies that have been trained 00:07:08.080 --> 00:07:12.640 on the largest data sets – over 700,000 seismograms. 00:07:13.280 --> 00:07:16.640 So, on the left, we have PhaseNet, which was trained on northern 00:07:16.640 --> 00:07:20.320 California data. In the middle, we have PickNet, which was 00:07:20.320 --> 00:07:25.040 trained on data from Japan. On the right, we have EQTransformer, 00:07:25.040 --> 00:07:30.560 which was trained on a global data set. And a lot of research has really been 00:07:30.560 --> 00:07:34.560 focused on creating new models and architectures, but they’ve all been 00:07:34.560 --> 00:07:38.120 trained on very different data sets. And there have been very limited 00:07:38.120 --> 00:07:41.840 apples-to-apples comparisons between these models. 00:07:41.840 --> 00:07:44.960 So that’s something that our group is working on right now.
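The thresholding-and-picking step described above (set a probability threshold, then place the pick at the probability peak within the window that exceeds it) can be sketched roughly as follows. The threshold value and the toy probability trace are assumptions for illustration, not values from any of the models mentioned.

```python
def pick_from_probability(prob, dt, threshold=0.5):
    """Return the time of the probability peak inside the first
    contiguous window where prob exceeds threshold, else None.
    prob: per-sample phase probability; dt: sample spacing in seconds."""
    start = None
    for i, p in enumerate(prob):
        if p >= threshold and start is None:
            start = i                        # window opens
        elif p < threshold and start is not None:
            window = prob[start:i]           # window closes at sample i
            return (start + window.index(max(window))) * dt
    if start is not None:                    # window ran to end of trace
        window = prob[start:]
        return (start + window.index(max(window))) * dt
    return None

# Toy P-probability trace sampled every 0.01 s; peak at sample 3.
prob = [0.0, 0.1, 0.6, 0.9, 0.7, 0.2, 0.0]
t = pick_from_probability(prob, dt=0.01)  # pick lands at 0.03 s
```

Production pickers typically add refinements such as minimum window lengths and merging of nearby windows, but the peak-above-threshold idea is the core of it.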
00:07:47.280 --> 00:07:53.520 And, more recently, deep learning models have been applied to later parts of 00:07:53.520 --> 00:07:58.400 the monitoring workflow, such as association and location and magnitude, 00:07:58.960 --> 00:08:03.680 and seismic networks are now experimenting with adding 00:08:03.680 --> 00:08:07.120 deep learning to their operational monitoring workflow. 00:08:07.760 --> 00:08:13.360 Some examples are the Oklahoma Geological Survey and the USGS NEIC. 00:08:14.320 --> 00:08:18.320 We’re also working on this in the Southern California Seismic Network. 00:08:19.360 --> 00:08:25.600 And we have a very ambitious plan at the Caltech Seismo Lab to completely 00:08:25.600 --> 00:08:29.920 rethink how earthquake monitoring is done in a modern way. 00:08:29.920 --> 00:08:35.440 So this is the Quakes2AWS project, where AWS is Amazon Web Services. 00:08:36.160 --> 00:08:41.040 So this new workflow is going to be cloud-native, even serverless, 00:08:41.600 --> 00:08:46.320 so that the program is available on demand. 00:08:46.320 --> 00:08:49.600 It’s very scalable in the event of large earthquakes. 00:08:50.400 --> 00:08:55.520 This workflow will have a modular architecture so that it’s really easy to 00:08:55.520 --> 00:08:58.640 swap different algorithms in and out for testing. 00:09:00.000 --> 00:09:05.600 This workflow can easily incorporate the latest scientific advances, such as 00:09:05.600 --> 00:09:09.040 deep learning models for earthquake detection and phase picking. 00:09:10.480 --> 00:09:15.440 And we envision that this workflow can be applied to real-time data, 00:09:15.440 --> 00:09:20.880 as well as archived data for research, and eventually in an internet-of-things 00:09:20.880 --> 00:09:26.080 sense to all kinds of sensors that can be used for earthquake monitoring, 00:09:26.080 --> 00:09:28.560 such as smartphones and smart devices. 00:09:30.880 --> 00:09:34.640 So I’ll quickly go through what we’ve done so far in Quakes2AWS.
00:09:36.240 --> 00:09:40.560 So here we have data from seismic stations coming into the system. 00:09:41.360 --> 00:09:46.400 And waveforms will go into the picker, and then P and S picks go out of the 00:09:46.400 --> 00:09:50.480 picker, and then these picks go into the associator, and then events 00:09:50.480 --> 00:09:54.400 come out of the associator, and so on. So that’s the overview. 00:09:55.040 --> 00:10:01.520 And then here I summarize what Quakes2AWS has implemented to date. 00:10:02.320 --> 00:10:07.920 So we have data from the seismic stations coming in real time. 00:10:07.920 --> 00:10:11.440 And we actually put it into the existing Earthworm system. 00:10:12.320 --> 00:10:18.400 And then the earthquake detection and phase picking is actually done in 00:10:18.400 --> 00:10:23.600 the cloud using Amazon Web Services. And then the picks come out of 00:10:23.600 --> 00:10:27.200 the cloud. And right now they go into binder, which is 00:10:27.200 --> 00:10:32.160 the association step in Earthworm. And then these associated events 00:10:32.160 --> 00:10:35.840 will go into the rest of the AQMS/Earthworm software, 00:10:35.840 --> 00:10:39.280 so really just isolating the detection and phase-picking steps. 00:10:40.400 --> 00:10:45.520 So peeking behind the curtain of the cloud a little bit, we use 00:10:45.520 --> 00:10:51.200 Amazon Kinesis to handle all the real-time data streaming. 00:10:51.200 --> 00:10:56.480 So the data arrives in real time, and then it’s fed into the deep learning 00:10:57.360 --> 00:11:03.040 detector and picker model, which detects P and S waves and then outputs picks. 00:11:03.040 --> 00:11:07.280 And then these picks go out of the cloud and back into Earthworm. 00:11:09.680 --> 00:11:13.120 And so what do we plan to do next?
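For illustration, a pick record streaming out of a cloud-hosted picker and back to the on-premise associator might be serialized something like this. The field names and the JSON encoding are hypothetical assumptions for the sketch, not the actual Quakes2AWS or Earthworm wire format.

```python
import json

def encode_pick(network, station, channel, phase, arrival_time, probability):
    """Serialize one phase pick as a JSON message for a streaming pipeline.
    Field names here are illustrative, not an established standard."""
    return json.dumps({
        "net": network,                 # network code, e.g. "CI"
        "sta": station,                 # station code
        "chan": channel,                # channel code, e.g. "HHZ"
        "phase": phase,                 # "P" or "S"
        "time": arrival_time,           # arrival time, epoch seconds
        "prob": round(probability, 3),  # model confidence for this pick
    })

msg = encode_pick("CI", "PAS", "HHZ", "P", 1609459200.25, 0.973)
rec = json.loads(msg)  # downstream consumer (e.g. the associator) decodes it
```

In a Kinesis-style setup, each such message would be one record on the stream, and the consumer side would decode it and hand the pick to the association step.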
00:11:13.120 --> 00:11:17.600 Well, we really want to focus on a rigorous benchmarking and 00:11:17.600 --> 00:11:21.840 comparison study to understand how good these deep learning models 00:11:21.840 --> 00:11:25.280 really are for earthquake detection and phase picking and association. 00:11:25.840 --> 00:11:28.800 How ready are they for operational prime time? 00:11:29.520 --> 00:11:34.160 So we want to compare apples-to-apples performance of these models 00:11:34.160 --> 00:11:38.960 on a common data set from southern California, both against each other 00:11:38.960 --> 00:11:43.600 and also against the operational AQMS/Earthworm software, 00:11:43.600 --> 00:11:46.320 both the automatic picks and the manual picks. 00:11:47.280 --> 00:11:50.080 And then, in the longer term, we want to continue building out 00:11:50.080 --> 00:11:54.960 this cloud-native Quakes2AWS software and set it up as a parallel 00:11:54.960 --> 00:11:58.720 development earthquake monitoring system, although this is on hold 00:11:58.720 --> 00:12:02.320 until we get more funding for a cloud software architect. 00:12:05.200 --> 00:12:09.680 I’d like to summarize that there are many potential operational benefits 00:12:09.680 --> 00:12:12.480 of these deep learning models for earthquake monitoring. 00:12:13.520 --> 00:12:19.120 They’ve been shown to be able to automatically process data with very few errors. 00:12:19.120 --> 00:12:21.920 They can pick phases almost as well as humans do. 00:12:23.360 --> 00:12:27.200 And then, once trained and tested, it’s easy to apply these 00:12:27.200 --> 00:12:30.400 deep learning models in near real time. They’re pretty fast. 00:12:31.280 --> 00:12:36.240 And these models have also been shown to perform well when earthquake 00:12:36.240 --> 00:12:40.000 sequences get really active, such as aftershocks or swarms.
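One way to sketch the apples-to-apples scoring in such a benchmark is to match a model's picks to analyst (manual) picks within a time tolerance and count hits and misses. The tolerance and the pick times below are made-up illustrative values, not actual benchmark parameters.

```python
def match_picks(model_picks, manual_picks, tol=0.5):
    """Greedily match model picks to analyst picks within `tol` seconds.
    Returns (true_positives, false_positives, false_negatives)."""
    unmatched = list(manual_picks)
    tp = 0
    for t in model_picks:
        hit = next((m for m in unmatched if abs(m - t) <= tol), None)
        if hit is not None:
            tp += 1
            unmatched.remove(hit)  # each analyst pick matches at most once
    fp = len(model_picks) - tp     # model picks with no analyst counterpart
    fn = len(unmatched)            # analyst picks the model missed
    return tp, fp, fn

manual = [10.02, 25.40, 61.75]   # analyst pick times (seconds)
model = [10.10, 25.30, 40.00]    # model pick times (seconds)
tp, fp, fn = match_picks(model, manual)
precision = tp / (tp + fp)       # fraction of model picks that are real
recall = tp / (tp + fn)          # fraction of analyst picks recovered
```

From counts like these one can also tabulate pick-time residuals for the matched pairs, which is the other standard axis of comparison between pickers.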
00:12:40.560 --> 00:12:44.400 So here’s an example of, you know, a two-minute-long sequence 00:12:44.400 --> 00:12:52.240 when the deep learning model was able to pick many P and S arrivals. 00:12:52.240 --> 00:12:56.720 Looking further down the line, deep learning also has a lot of 00:12:56.720 --> 00:13:00.400 scientific benefits because it’s really good at finding small 00:13:00.400 --> 00:13:04.240 earthquakes that are not in the catalog. So we’re really able to get more 00:13:04.240 --> 00:13:08.080 complete earthquake catalogs, which are inputs into improving 00:13:08.080 --> 00:13:10.240 forecasting and seismic hazard assessment. 00:13:10.880 --> 00:13:15.840 Here’s a magnitude-frequency distribution of earthquakes from 00:13:15.840 --> 00:13:19.600 the 2020 Puerto Rico sequence that I detected with the 00:13:19.600 --> 00:13:24.160 EQTransformer deep learning model. The dark blue shows the catalog 00:13:24.160 --> 00:13:28.480 earthquakes it detected, and the light blue shows the 00:13:28.480 --> 00:13:33.760 smaller, previously unknown earthquakes that were detected – 00:13:33.760 --> 00:13:36.720 almost 9 times as many as the catalog earthquakes. 00:13:38.160 --> 00:13:41.760 So better picks and more earthquakes detected means 00:13:41.760 --> 00:13:45.760 more inputs into tomography, which can improve velocity models 00:13:45.760 --> 00:13:47.840 and our understanding of the subsurface. 00:13:49.200 --> 00:13:52.160 We can also get more detail about earthquake processes, 00:13:52.160 --> 00:13:55.840 such as fine-scale fault structure and fluid migration. 00:13:58.240 --> 00:14:02.160 I do want to point out that currently deep learning does have some 00:14:02.160 --> 00:14:05.840 limitations before it’s completely ready for earthquake monitoring.
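A magnitude-frequency distribution like the one mentioned above can be built by simply binning catalog magnitudes. Here is a minimal sketch with made-up magnitudes (this is not the Puerto Rico data set), using half-magnitude-unit bins as an illustrative choice.

```python
from collections import Counter
import math

def magnitude_frequency(mags, bin_width=0.5):
    """Count earthquakes per magnitude bin (non-cumulative).
    Each magnitude is assigned to the bin containing it, keyed by
    the bin's lower edge."""
    bins = Counter(math.floor(m / bin_width) * bin_width for m in mags)
    return dict(sorted(bins.items()))

# Illustrative catalog magnitudes, not real data.
mags = [0.8, 1.1, 1.2, 1.4, 1.6, 2.1, 2.3, 3.0]
dist = magnitude_frequency(mags)
# {0.5: 1, 1.0: 3, 1.5: 1, 2.0: 2, 3.0: 1}
```

Comparing such a histogram before and after adding deep-learning detections shows where the new, mostly small-magnitude events fall, and where the catalog's completeness magnitude sits.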
00:14:07.120 --> 00:14:10.960 So here’s the same magnitude-frequency plot from the Puerto Rico 00:14:10.960 --> 00:14:15.440 sequence that I showed in the earlier slide, but now, in red, I’ve added 00:14:15.440 --> 00:14:19.520 the earthquakes that the EQTransformer deep learning model 00:14:19.520 --> 00:14:26.000 failed to detect but were in the catalog. And you can see that this model 00:14:26.000 --> 00:14:30.400 missed some of the larger earthquakes, including the magnitude 6.4 00:14:30.400 --> 00:14:35.280 main shock. And that’s definitely a problem for operational monitoring 00:14:35.280 --> 00:14:39.280 because, you know, we don’t want to miss out on the largest earthquakes, 00:14:39.280 --> 00:14:42.080 which are really the most important ones. 00:14:42.800 --> 00:14:47.360 And so I’d say that, at this point, deep learning is ready to augment 00:14:47.360 --> 00:14:52.080 or add new information to existing earthquake catalogs but not yet ready 00:14:52.080 --> 00:14:55.840 to completely replace existing earthquake monitoring systems. 00:14:56.560 --> 00:15:01.200 And more research needs to be done to address questions like why these 00:15:01.200 --> 00:15:06.640 models fail to detect the larger earthquakes that don’t happen as often. 00:15:06.640 --> 00:15:11.840 And also, what can we do to improve these models? 00:15:11.840 --> 00:15:16.640 So, to summarize, I’d like to end by saying that machine learning, 00:15:16.640 --> 00:15:20.960 and especially deep learning, have a lot of potential to improve 00:15:20.960 --> 00:15:24.960 and automate and modernize the way that earthquake monitoring is done. 00:15:25.520 --> 00:15:29.200 And there are many operational and scientific benefits that will come out 00:15:29.200 --> 00:15:34.000 of using these deep learning models.
And so I hope that, in the future, 00:15:34.560 --> 00:15:40.320 the path from continuous seismic data to earthquake catalog will go through 00:15:40.320 --> 00:15:43.920 our new Quakes2AWS workflow that we’re currently developing. 00:15:44.640 --> 00:15:49.280 It really combines a modular cloud-native software architecture 00:15:49.280 --> 00:15:53.280 with benchmarked deep learning models for every part of the earthquake 00:15:53.280 --> 00:15:58.000 monitoring workflow. And here’s my email if you would like to contact me.