WEBVTT
Kind: captions
Language: en-US

00:00:00.480 --> 00:00:04.640
Hi, everyone. I’m Clara Yoon
from the USGS Pasadena office.

00:00:05.200 --> 00:00:09.440
And I’ll be telling you about the
potential for machine learning to

00:00:09.440 --> 00:00:13.280
really improve and modernize the
way earthquake monitoring is done.

00:00:14.000 --> 00:00:16.080
And I would like to
thank my colleagues

00:00:16.080 --> 00:00:17.920
from the
Caltech Seismo Lab.

00:00:22.560 --> 00:00:26.080
At a very high level, this is how
earthquake monitoring is done

00:00:26.720 --> 00:00:31.280
in a regional seismic network.
We start with continuous seismic

00:00:31.280 --> 00:00:35.600
data recorded at a few hundred
stations all across the region.

00:00:36.240 --> 00:00:41.760
And that data is fed in real time
into software that’s running on

00:00:41.760 --> 00:00:46.160
a bunch of computer servers.
And the output of the software

00:00:46.160 --> 00:00:50.080
is an earthquake catalog,
which tells you the location,

00:00:50.080 --> 00:00:54.160
time, and magnitude of all
the earthquakes in this region.

00:00:54.800 --> 00:00:57.840
Now, how does this
software actually work?

00:00:58.640 --> 00:01:01.520
Well, the software that we use
for earthquake monitoring

00:01:01.520 --> 00:01:06.160
is called AQMS/Earthworm.
It provides real-time automatic

00:01:06.160 --> 00:01:10.080
earthquake information.
And it’s over 20 years old.

00:01:10.080 --> 00:01:15.600
It’s been thoroughly tested and well-
tuned to run during this entire time.

00:01:16.320 --> 00:01:21.280
And how it works is, it starts with
the continuous seismic data as input.

00:01:21.920 --> 00:01:26.880
It detects earthquake signals and
then picks P and S phase arrivals

00:01:26.880 --> 00:01:30.880
and determines first-motion polarity.
And so these steps are done

00:01:30.880 --> 00:01:35.920
at every station. And the next step
is to take many phases from

00:01:35.920 --> 00:01:40.160
different stations and
associate them into an event.

00:01:40.160 --> 00:01:44.320
Then we locate that event,
compute its magnitude,

00:01:44.320 --> 00:01:49.120
determine its focal mechanism,
moment tensor for larger earthquakes –

00:01:49.120 --> 00:01:53.840
and so this – all this information
goes into the earthquake catalog.

00:01:55.440 --> 00:01:59.840
Now, the problem with this [inaudible]
software is that these automatic

00:01:59.840 --> 00:02:04.800
algorithms, especially for detection
and phase picking, they’re not perfect.

00:02:04.800 --> 00:02:06.640
Sometimes they
can make mistakes.

00:02:07.440 --> 00:02:14.560
And so our standard practice in the
seismic networks is to visually review

00:02:14.560 --> 00:02:18.640
all these earthquake
solutions by human analysts.

00:02:18.640 --> 00:02:21.840
And phases are manually
re-picked if necessary.

00:02:22.640 --> 00:02:25.040
And so here I’m showing
you an example.

00:02:25.040 --> 00:02:29.840
The green bars here on the left show
you where the automatic algorithm

00:02:29.840 --> 00:02:34.960
thinks the P and S phases should be.
The P phases look pretty good, but the

00:02:34.960 --> 00:02:39.760
S phases should definitely be earlier.
And so a human analyst really

00:02:39.760 --> 00:02:42.880
needs to go through and
manually pick all these events

00:02:42.880 --> 00:02:46.640
so that they line up
and now they’re correct.

00:02:46.640 --> 00:02:51.920
And so, looking towards the future, we
really want to be able to automate this

00:02:51.920 --> 00:02:57.680
entire earthquake monitoring process
without too much human intervention.

00:02:58.400 --> 00:03:03.120
And so that’s where artificial
intelligence, and more specifically,

00:03:03.120 --> 00:03:08.560
machine learning, could really play
a big role in entirely automating this

00:03:08.560 --> 00:03:12.560
earthquake monitoring process.
So the definition of machine learning

00:03:12.560 --> 00:03:17.760
is really algorithms that are able to
kind of learn on their own by example

00:03:17.760 --> 00:03:20.800
without explicitly
being told how to do so.

00:03:21.920 --> 00:03:27.280
And today I’ll talk about a subset of
machine learning called deep learning,

00:03:27.280 --> 00:03:33.040
which is where you have these artificial
neural networks that take massive data

00:03:33.040 --> 00:03:38.320
sets as input – so these are called
training data sets, and they already

00:03:38.320 --> 00:03:41.920
have known labels. So you already
know ahead of time what the answer is.

00:03:42.640 --> 00:03:49.680
And these will develop a model that can
learn from vast amounts of data

00:03:49.680 --> 00:03:54.160
in order to predict information
about new data that it’s never seen

00:03:54.160 --> 00:03:59.440
before in the future. And deep learning
has had tremendous impacts on many

00:03:59.440 --> 00:04:05.120
different fields, such image
recognition, medical diagnosis,

00:04:05.120 --> 00:04:10.800
speech recognition, self-driving cars,
and also in earthquake seismology

00:04:10.800 --> 00:04:13.840
and monitoring,
as I will discuss today.

00:04:16.240 --> 00:04:21.120
And so the majority of deep learning
research that could be applied to

00:04:21.120 --> 00:04:26.400
earthquake monitoring since 2018
has been done for earthquake

00:04:26.400 --> 00:04:30.000
detection and phase picking.
And that’s because this is

00:04:30.000 --> 00:04:34.480
a really tedious task that
requires a lot of human review.

00:04:34.480 --> 00:04:38.800
It’s really ripe for automation.
And what’s more, we already have

00:04:38.800 --> 00:04:44.240
these massive labeled data sets
available for training that are really a

00:04:44.240 --> 00:04:49.360
requirement for these types of models.
Because we’ve had human analysts

00:04:49.360 --> 00:04:54.640
over the decades build up millions
of examples of P picks and S picks.

00:04:55.280 --> 00:05:00.240
And now GPUs are widely available.
They get – they give us a lot of

00:05:00.240 --> 00:05:04.160
computational power.
And then we also have open source

00:05:04.160 --> 00:05:08.400
software libraries for deep learning,
such as TensorFlow and PyTorch

00:05:08.400 --> 00:05:11.440
that are well-documented
and accessible to non-experts.

00:05:12.160 --> 00:05:14.560
So all these things together
have really paved the way

00:05:14.560 --> 00:05:17.520
for deep learning in
earthquake monitoring.

00:05:20.240 --> 00:05:26.400
So here I’m showing you one of the first
examples of a deep learning model that

00:05:26.400 --> 00:05:29.440
has been developed for earthquake
detection and phase picking.

00:05:30.640 --> 00:05:35.280
This is called generalized phase
detection from Ross et al. 2018.

00:05:36.320 --> 00:05:42.400
The training data set that was input into
developing this model was really big.

00:05:42.400 --> 00:05:48.160
It’s from southern California.
It consisted of 1.5 million examples

00:05:48.160 --> 00:05:52.720
each of P waves, S waves,
and noise seismograms.

00:05:53.520 --> 00:05:58.320
So the input to this model is
a three-component waveform

00:05:58.320 --> 00:06:01.680
that’s a few seconds long.
And the deep learning model

00:06:01.680 --> 00:06:05.280
is able to automatically extract the most

00:06:05.280 --> 00:06:10.960
important features that really distinguish
a P wave from an S wave from noise.

00:06:10.960 --> 00:06:14.640
So this is one of the big advantages
of using deep learning.

00:06:15.440 --> 00:06:20.000
And then, when it all goes –
all this data goes through the model,

00:06:20.000 --> 00:06:24.960
the outputs are probability
that this input was a P wave

00:06:24.960 --> 00:06:30.000
or an S wave or noise.
So here’s an example of a P wave

00:06:30.000 --> 00:06:35.360
detection in red, with a corresponding
high probability during that time,

00:06:35.360 --> 00:06:39.280
and S wave detection in blue
with a high probability.

00:06:39.840 --> 00:06:42.320
And usually a threshold is
set on this probability,

00:06:42.960 --> 00:06:47.920
and then a phase is picked
within that time window.

00:06:47.920 --> 00:06:52.880
There’s been a tremendous amount of
research in the past couple of years that

00:06:52.880 --> 00:06:58.400
have developed these kind of
alternative deep learning models for

00:06:58.400 --> 00:07:02.560
earthquake detection and phase picking.
Here are some of the studies here

00:07:03.120 --> 00:07:08.080
listed here. I highlight three examples
of studies that have been trained

00:07:08.080 --> 00:07:12.640
on the largest data sets –
over 700,000 seismograms.

00:07:13.280 --> 00:07:16.640
So, on the left, we have PhaseNet,
which was trained on northern

00:07:16.640 --> 00:07:20.320
California data. In the middle,
we have PickNet, which was

00:07:20.320 --> 00:07:25.040
trained on data from Japan.
On the right, we have EQTransformer,

00:07:25.040 --> 00:07:30.560
which was trained on a global data set.
And a lot of research has really been

00:07:30.560 --> 00:07:34.560
focused on creating new models
and architectures, but they’ve all been

00:07:34.560 --> 00:07:38.120
trained on very different data sets.
And there’s been very limited a

00:07:38.120 --> 00:07:41.840
pples-to-apple comparisons
between these models.

00:07:41.840 --> 00:07:44.960
So that’s something that our
group is working on right now.

00:07:47.280 --> 00:07:53.520
And, where lately, deep learning models
have been applied to later parts of

00:07:53.520 --> 00:07:58.400
the monitoring workflow, such as
association and location and magnitude,

00:07:58.960 --> 00:08:03.680
and seismic networks are now
experimenting with adding

00:08:03.680 --> 00:08:07.120
deep learning to their operational
monitoring workflow.

00:08:07.760 --> 00:08:13.360
Some examples are the Oklahoma
Geological Survey and the USGS NEIC.

00:08:14.320 --> 00:08:18.320
We’re also working on this in the
Southern California Seismic Network.

00:08:19.360 --> 00:08:25.600
And we have a very ambitious plan a
t the Caltech Seismo Lab to completely

00:08:25.600 --> 00:08:29.920
rethink how earthquake
monitoring is done in a modern way.

00:08:29.920 --> 00:08:35.440
So this is the Quakes2AWS project,
where AWS is Amazon Web Services.

00:08:36.160 --> 00:08:41.040
So this new workflow is going to
be cloud-native, even serverless,

00:08:41.600 --> 00:08:46.320
so that the program
is available on demand.

00:08:46.320 --> 00:08:49.600
It’s very scalable in the
event of large earthquakes.

00:08:50.400 --> 00:08:55.520
This workflow will have a modular
architecture so that it’s really easy to

00:08:55.520 --> 00:08:58.640
swap different algorithms
in and out for testing.

00:09:00.000 --> 00:09:05.600
This workflow can easily incorporate
latest scientific advances, such as

00:09:05.600 --> 00:09:09.040
deep learning models for earthquake
detection and phase picking.

00:09:10.480 --> 00:09:15.440
And we envision that this workflow
can be applied to real-time data,

00:09:15.440 --> 00:09:20.880
as well as archive data for research,
and eventually in an internet-of-things

00:09:20.880 --> 00:09:26.080
sense to all kinds of sensors that
can be used for earthquake monitoring,

00:09:26.080 --> 00:09:28.560
such as smartphones
and smart devices.

00:09:30.880 --> 00:09:34.640
So I’ll quickly go through what
we’ve done so far in Quakes2AWS.

00:09:36.240 --> 00:09:40.560
So here we have data from seismic
stations coming into the system.

00:09:41.360 --> 00:09:46.400
And waveforms will go into the picker,
and then P and S picks go out of the

00:09:46.400 --> 00:09:50.480
picker, and then these picks go
into the associator, and then events

00:09:50.480 --> 00:09:54.400
come out of the associator, and so on.
So that’s the overview.

00:09:55.040 --> 00:10:01.520
And then here I summarize what
Quakes2AWS has implemented to date.

00:10:02.320 --> 00:10:07.920
So we have data from the seismic
stations coming in real time.

00:10:07.920 --> 00:10:11.440
And we actually put it into
the existing Earthworm system.

00:10:12.320 --> 00:10:18.400
And then the earthquake detection
and phase picking is actually done in

00:10:18.400 --> 00:10:23.600
the cloud using Amazon Web Services.
And then the picks come out of

00:10:23.600 --> 00:10:27.200
the cloud. And right now
they go into binder, which is

00:10:27.200 --> 00:10:32.160
the association step in Earthworm.
And then these associated events

00:10:32.160 --> 00:10:35.840
will go into the rest of the
AQMS/Earthworm software,

00:10:35.840 --> 00:10:39.280
so really just isolating the
detection and phase-picking steps.

00:10:40.400 --> 00:10:45.520
So peeking behind the curtain of
the cloud a little bit, we use

00:10:45.520 --> 00:10:51.200
Amazon Kinesis to do all the
real-time data streaming handling.

00:10:51.200 --> 00:10:56.480
And then – so the data goes in real time,
and then it’s fed into the deep learning

00:10:57.360 --> 00:11:03.040
detector and picker model, and it
outputs P and S waves and then picks.

00:11:03.040 --> 00:11:07.280
And then these picks will then go out
of the cloud and back into Earthworm.

00:11:09.680 --> 00:11:13.120
And so what do
we plan to do next?

00:11:13.120 --> 00:11:17.600
Well, we really want to focus on
a rigorous benchmarking and

00:11:17.600 --> 00:11:21.840
comparison study to understand
how good these deep learning models

00:11:21.840 --> 00:11:25.280
really are for earthquake detection
and phase picking and association.

00:11:25.840 --> 00:11:28.800
How ready are they
for operational prime time?

00:11:29.520 --> 00:11:34.160
So we want to compare kind of apples-
to-apples performance of these models

00:11:34.160 --> 00:11:38.960
on a common data set from southern
California, both against each other

00:11:38.960 --> 00:11:43.600
and also against the operational
AQMS/Earthworm software,

00:11:43.600 --> 00:11:46.320
both the automatic
pics and the manual picks.

00:11:47.280 --> 00:11:50.080
And then, in the longer term,
we want to continue building out

00:11:50.080 --> 00:11:54.960
this cloud-native Quakes2AWS
software and set it up as a parallel

00:11:54.960 --> 00:11:58.720
development earthquake monitoring
system, although this is on hold

00:11:58.720 --> 00:12:02.320
until we get more funding
for a cloud software architect.

00:12:05.200 --> 00:12:09.680
I’d like to summarize that there are
many potential operational benefits

00:12:09.680 --> 00:12:12.480
of these deep learning modules
for earthquake monitoring.

00:12:13.520 --> 00:12:19.120
It’s shown to be able to automatically
process data with very few errors.

00:12:19.120 --> 00:12:21.920
It can pick phases almost
as well as humans do.

00:12:23.360 --> 00:12:27.200
And then, once trained and tested,
we can – it’s easy to apply these

00:12:27.200 --> 00:12:30.400
deep learning models in near
real time. They’re pretty fast.

00:12:31.280 --> 00:12:36.240
And these models have also been
shown to perform well when earthquake

00:12:36.240 --> 00:12:40.000
sequences get really active,
such as aftershocks or swarms.

00:12:40.560 --> 00:12:44.400
So here’s an example of, you know,
a two-minute long sequence

00:12:44.400 --> 00:12:52.240
when the deep learning model
was able to many P and S arrivals.

00:12:52.240 --> 00:12:56.720
Looking further down the line,
deep learning also has a lot of

00:12:56.720 --> 00:13:00.400
scientific benefits because it’s
really good at finding small

00:13:00.400 --> 00:13:04.240
earthquakes that are not in the catalog.
So we’re really able to get more

00:13:04.240 --> 00:13:08.080
complete earthquake catalogs,
which are inputs into improving

00:13:08.080 --> 00:13:10.240
forecasting and seismic
hazard assessment.

00:13:10.880 --> 00:13:15.840
Here’s a magnitude frequency
distribution of earthquakes from

00:13:15.840 --> 00:13:19.600
the 2020 Puerto Rico sequence
that I detected with the

00:13:19.600 --> 00:13:24.160
EQTransformer deep learning model.
The dark blue shows the catalog

00:13:24.160 --> 00:13:28.480
earthquakes it detected,
and the light blue shows the

00:13:28.480 --> 00:13:33.760
smaller previously unknown
earthquakes that were detected –

00:13:33.760 --> 00:13:36.720
almost 9 times as many
as the catalog earthquakes.

00:13:38.160 --> 00:13:41.760
So better picks and more
earthquakes detected means

00:13:41.760 --> 00:13:45.760
more inputs into tomography,
which can improve velocity models

00:13:45.760 --> 00:13:47.840
and our understanding
of the subsurface.

00:13:49.200 --> 00:13:52.160
We can also get more detail
about earthquake processes,

00:13:52.160 --> 00:13:55.840
such as fine-scale fault
structure and fluid migration.

00:13:58.240 --> 00:14:02.160
I do want to point out that currently
deep learning does have some

00:14:02.160 --> 00:14:05.840
limitations before it’s completely
ready for earthquake monitoring.

00:14:07.120 --> 00:14:10.960
So here’s the same magnitude
frequency plot from the Puerto Rico

00:14:10.960 --> 00:14:15.440
sequence that I showed in the earlier
slide, but now, in red, I’ve added

00:14:15.440 --> 00:14:19.520
the earthquakes that the
EQTransformer deep learning model

00:14:19.520 --> 00:14:26.000
failed to detect but were in the catalog.
And you can see that this model

00:14:26.000 --> 00:14:30.400
missed some of the larger earthquakes,
including the magnitude 6.4

00:14:30.400 --> 00:14:35.280
main shock. And that’s definitely
a problem for operational monitoring

00:14:35.280 --> 00:14:39.280
because, you know, we don’t want
to miss out on the largest earthquakes

00:14:39.280 --> 00:14:42.080
which are really the
most important ones.

00:14:42.800 --> 00:14:47.360
And so I’d say that, at this point,
deep learning is ready to augment

00:14:47.360 --> 00:14:52.080
or add new information to existing
earthquake catalogs but not yet ready

00:14:52.080 --> 00:14:55.840
to completely replace existing
earthquake monitoring systems.

00:14:56.560 --> 00:15:01.200
And more research needs to be done to
address questions like why these

00:15:01.200 --> 00:15:06.640
models fail to detect the larger
earthquakes that don’t happen as often.

00:15:06.640 --> 00:15:11.840
And also, how can – what can
we do to improve these models?

00:15:11.840 --> 00:15:16.640
So, to summarize, I’d like to end
by saying that machine learning,

00:15:16.640 --> 00:15:20.960
and especially deep learning,
have a lot of potential to improve

00:15:20.960 --> 00:15:24.960
and automate and modernize the way
that earthquake monitoring is done.

00:15:25.520 --> 00:15:29.200
And there are many operational and
scientific benefits that will come out

00:15:29.200 --> 00:15:34.000
of using these deep learning models.
And so I hope that, in the future,

00:15:34.560 --> 00:15:40.320
the path from continuous seismic data
to earthquake catalog will go through

00:15:40.320 --> 00:15:43.920
our new Quakes2AWS workflow
that we’re currently developing.

00:15:44.640 --> 00:15:49.280
It really combines a modular
cloud-native software architecture

00:15:49.280 --> 00:15:53.280
with benchmark deep learning models
for every part of the earthquake

00:15:53.280 --> 00:15:58.000
monitoring workflow. And here’s my
email if you would like to contact me.