[MSNoise] [EXTERNAL] Re: MWCS sql crash

Flinders, Ashton aflinders at usgs.gov
Fri Oct 19 19:13:54 UTC 2018


It ran for less than a day. There were roughly 12k MWCS jobs, but not all of
those ran because of the REF file issue. I reran it last night and it
finished within the last hour. It did compute all the jobs, but it gave me
the same error at the end. So it looks like an exit error once the
compute_mwcs job is done (no exit code, and then SQL hangs?).

Of course, this could all be complicated by the fact that I am using a Slurm
job scheduler to handle processor assignment.
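
For reference, here is a minimal sketch of one way a long-running worker can
survive a dropped database connection: catch the "connection lost" error,
open a fresh connection, and retry the statement. This is not how MSNoise
handles it. ConnectionLost, FakeConnection, make_connector and
run_with_reconnect are hypothetical names used only for illustration, with
ConnectionLost standing in for the pymysql OperationalError (2006) in the
traceback below. With SQLAlchemy, the pool_pre_ping and pool_recycle
arguments to create_engine are the built-in options aimed at the same
problem.

```python
import time


class ConnectionLost(Exception):
    """Stand-in for a driver error such as 'MySQL server has gone away'."""


def run_with_reconnect(connect, work, retries=3, delay=0.0):
    """Call work(conn); on ConnectionLost, reconnect and try again.

    connect: zero-argument callable returning a new connection.
    work:    callable taking a connection and returning a result.
    """
    conn = connect()
    for attempt in range(retries + 1):
        try:
            return work(conn)
        except ConnectionLost:
            if attempt == retries:
                raise  # give up after the last allowed retry
            time.sleep(delay)
            conn = connect()  # replace the dead connection


class FakeConnection:
    """Toy connection used to exercise the retry logic without a server."""

    def __init__(self, alive=True):
        self.alive = alive

    def execute(self, statement):
        if not self.alive:
            raise ConnectionLost("MySQL server has gone away")
        return "ok: " + statement


def make_connector(dead_first=1):
    """Return (connect, state); the first dead_first connections are dead."""
    state = {"calls": 0}

    def connect():
        state["calls"] += 1
        return FakeConnection(alive=state["calls"] > dead_first)

    return connect, state


if __name__ == "__main__":
    connect, state = make_connector()
    print(run_with_reconnect(connect, lambda c: c.execute("UPDATE jobs")))
    print("connections opened:", state["calls"])
```

The retry only helps if the statement is safe to repeat, e.g. setting a job
flag to a fixed value.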

On Fri, Oct 19, 2018 at 11:01 AM Thomas Lecocq <Thomas.Lecocq at seismology.be>
wrote:

> Ashton,
>
> No, I don't think it's linked. If the REF file is not available, the
> code should crash and not hang.
>
> How long did your MWCS run? How many MWCS jobs are there? How many
> stations / station pairs?
>
> What is the content of your my.cnf (or other MySQL configuration file)?
>
> Thomas
>
>
> On 19/10/2018 at 18:53, Flinders, Ashton wrote:
> > Hi Thomas, I actually think this was related to the PR I submitted the
> > other day. Since I have a mix of stations (some 3-comp, some only Z), when
> > mwcs_compute tried to calculate RR for a station pair that only had ZZ
> > and couldn't find the reference function, it crashed/hung. Then, after
> > hanging for a while, it threw the SQL error.
> >
> > On Thu, Oct 18, 2018 at 11:15 PM Thomas Lecocq <Thomas.Lecocq at seismology.be>
> > wrote:
> >
> >> Hi Ashton
> >>
> >> it seems your MWCS computation took a looooooong time and the MySQL
> >> connection was killed during that time. Can you confirm ?
> >>
> >> Thomas
> >>
> >>
> >> On 18/10/2018 at 18:54, Flinders, Ashton wrote:
> >>> I get a strange crash partway through my MWCS step (see below), and
> >>> compute_mwcs is not finishing. E.g. I have 5 frequency bands, but for
> >>> bands 2-4 only 1 of 10 station-pair MWCSs gets calculated, even though
> >>> all the data is there in the stacks. I have tried rerunning compute_mwcs
> >>> by changing the flag back to 'T' for the station pairs where MWCS did
> >>> not get calculated, but it still crashes. The crash is repeatable.
> >>>
> >>> Any thoughts?
> >>>
> >>> (p.s. I also initially tried remaking the stacks, but it crashed at the
> >>> same point. The data looks good in the stacks)
> >>>
> >>> -ashton
> >>>
> >>> During handling of the above exception, another exception occurred:
> >>>
> >>> Traceback (most recent call last):
> >>>   File "/home/ashton/.local/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
> >>>     context)
> >>>   File "/home/ashton/.local/lib/python3.5/site-packages/sqlalchemy/engine/default.py", line 450, in do_execute
> >>>     cursor.execute(statement, parameters)
> >>>   File "/home/ashton/anaconda3/envs/msnoise/lib/python3.5/site-packages/pymysql/cursors.py", line 165, in execute
> >>>     result = self._query(query)
> >>>   File "/home/ashton/anaconda3/envs/msnoise/lib/python3.5/site-packages/pymysql/cursors.py", line 321, in _query
> >>>     conn.query(q)
> >>>   File "/home/ashton/anaconda3/envs/msnoise/lib/python3.5/site-packages/pymysql/connections.py", line 859, in query
> >>>     self._execute_command(COMMAND.COM_QUERY, sql)
> >>>   File "/home/ashton/anaconda3/envs/msnoise/lib/python3.5/site-packages/pymysql/connections.py", line 1096, in _execute_command
> >>>     self._write_bytes(packet)
> >>>   File "/home/ashton/anaconda3/envs/msnoise/lib/python3.5/site-packages/pymysql/connections.py", line 1048, in _write_bytes
> >>>     "MySQL server has gone away (%r)" % (e,))
> >>> pymysql.err.OperationalError: (2006, "MySQL server has gone away (BrokenPipeError(32, 'Broken pipe'))")
> >>>
> >>> The above exception was the direct cause of the following exception:
> >>>
> >> _______________________________________________
> >> MSNoise mailing list
> >> MSNoise at mailman-as.oma.be
> >> http://mailman-as.oma.be/mailman/listinfo/msnoise
> >>
> >
>


-- 
Ashton F. Flinders, Ph.D
U.S. Geological Survey
345 Middlefield Road
Menlo Park, CA 94025
(650) 329-5050

