[MSNoise] MSNoise Digest, Vol 27, Issue 2

Esteban Chaves echfisica at gmail.com
Mon May 2 17:30:33 UTC 2016


Hi all,

I would say that before processing a given dataset it is important to get to know your data and to define what you are looking for, i.e. your scientific question. If one understands the source generating the random wave field, its interaction with the medium, and its susceptibility to changes, then it is easier to select a dataset, moving windows, frequencies, etc.
For instance, small subsets centered on a specific transient event work well initially.

Esteban



> On May 2, 2016, at 12:44 AM, msnoise-request at mailman-as.oma.be wrote:
> 
> Send MSNoise mailing list submissions to
> 	msnoise at mailman-as.oma.be
> 
> To subscribe or unsubscribe via the World Wide Web, visit
> 	http://mailman-as.oma.be/mailman/listinfo/msnoise
> or, via email, send a message with subject or body 'help' to
> 	msnoise-request at mailman-as.oma.be
> 
> You can reach the person managing the list at
> 	msnoise-owner at mailman-as.oma.be
> 
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of MSNoise digest..."
> 
> 
> Today's Topics:
> 
>   1. Re: advice on processing database subsets (Lukas Preiswerk)
>   2. Re: advice on processing database subsets (Thomas Lecocq)
>   3. Re: advice on processing database subsets (Phil Cummins)
>   4. Re: advice on processing database subsets (Thomas Lecocq)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Sun, 1 May 2016 16:52:06 +0200
> From: Lukas Preiswerk <preiswerk at vaw.baug.ethz.ch>
> To: Python Package for Monitoring Seismic Velocity Changes using
> 	Ambient	Seismic Noise <msnoise at mailman-as.oma.be>
> Subject: Re: [MSNoise] advice on processing database subsets
> Message-ID:
> 	<CAOSnoQ3rCc5U7i9TVTBMKngZfcM1gCYT7paudi9Hg=x=rCSG3g at mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
> 
> Hi all
> 
> I was in a similar situation as Phil, and I used (1). It's not
> straightforward to copy the database and make msnoise work again in a new
> directory. But it's definitely possible.
> I actually think it would be a nice addition to msnoise to not only make an
> option for multiple filters, but also for multiple other parameters (window
> lengths, overlaps, winsorizing, etc.). This would really help in the first
> "exploratory phase" of finding out the best way to process your
> dataset.
> What do you think of this idea? Practically, I would implement it by moving
> these parameters (window length etc.) to the filter parameters, and treat
> them the same way as an additional filter. As far as I understand the
> code, this wouldn't require many adaptations?
> 
> Lukas
> 
> 
> 
> 2016-05-01 11:35 GMT+02:00 Thomas Lecocq <Thomas.Lecocq at seismology.be>:
> 
>> Hi Phil,
>> 
>> I'd say (3) would be better indeed. You can script msnoise using the api.
>> If you need to change params in the config, you can alternatively use the
>> "msnoise config --set name=value" command.
>> 
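>> A minimal sketch of the scripted route (assuming your version of
>> msnoise.api exposes update_config(session, name, value); the parameter
>> name below is just an illustration):
>> 
>> from msnoise.api import connect, update_config
>> 
>> session = connect()
>> # same effect as: msnoise config --set corr_duration=1800
>> update_config(session, "corr_duration", "1800")
>> 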
>> Please keep me updated of your progresses & tests !
>> 
>> Thomas
>> 
>> 
>> 
>> On 01/05/2016 10:34, Phil Cummins wrote:
>> 
>>> Hi again,
>>> 
>>> As some of you may recall, I'm just getting started with msnoise. I have
>>> a large database and have managed to get my station and data availability
>>> tables populated.
>>> At this point, rather than running through the whole database, processing
>>> it with parameters I hope might work, I'd rather process small subsets,
>>> e.g. 1 day at a time, to experiment with window lengths, overlaps, etc., to
>>> find what seems optimal. My question is, what's the best way to process
>>> subsets of my database?
>>> It seems to me I have several options:
>>>    (1) Make separate databases for each subset I want to test, and run
>>> through the workflow on each
>>>    (2) Set start and end times appropriate for my subset, re-scan and
>>> run through the workflow.
>>>    (3) Populate the jobs table, and write a script to activate only the
>>> jobs I want and not the others.
>>> I want to do a fair bit of testing using different parameters before I run
>>> through the whole thing, so I think (3) may be best. But any advice would
>>> be appreciated.
>>> Regards,
>>> 
>>> - Phil
>>> _______________________________________________
>>> MSNoise mailing list
>>> MSNoise at mailman-as.oma.be
>>> http://mailman-as.oma.be/mailman/listinfo/msnoise
>>> 
>> 
>> _______________________________________________
>> MSNoise mailing list
>> MSNoise at mailman-as.oma.be
>> http://mailman-as.oma.be/mailman/listinfo/msnoise
>> 
> 
> 
> ------------------------------
> 
> Message: 2
> Date: Sun, 1 May 2016 20:18:26 +0200
> From: Thomas Lecocq <Thomas.Lecocq at seismology.be>
> To: msnoise at mailman-as.oma.be
> Subject: Re: [MSNoise] advice on processing database subsets
> Message-ID: <8ee5b4f1-ce82-fa13-b898-9ddd1743451e at seismology.be>
> Content-Type: text/plain; charset=utf-8; format=flowed
> 
> Hi guys,
> 
> Yeah, I have been thinking about a "benchmark" mode for quite a number 
> of weeks, i.e. since I tested a first run of PWS in order to compare the 
> final dv/v; to compare properly, I have to test quite a number of 
> parameters.
> 
> My current idea is to run a set of possible parameters, for different 
> steps. This would lead to a large number of branches in a large tree, 
> but it would definitely be quite interesting.
> 
> I am really not in favor of duplicating the database; rather, I'd create 
> a "config" file with a caller script to set/change parameters... 
> Theoretically, the API should let you do all the actions. The only thing 
> that would be a little trickier is to store/reuse the results of each 
> step in order to compare them. For info, using the "shutil" module you 
> can move/copy files easily.
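> 
> A rough sketch of that last idea (assuming stacks end up in the default
> "STACKS" folder; names here are only illustrative):
> 
> import os
> import shutil
> 
> run_id = "test_wlen120"  # a label for this parameter set
> # archive this run's stacks before relaunching with new parameters
> shutil.copytree("STACKS", os.path.join("benchmarks", run_id))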
> 
> Let's keep brainstorming on that and see how it goes !
> 
> Cheers
> 
> Thomas
> 
> On 01/05/2016 16:52, Lukas Preiswerk wrote:
>> Hi all
>> 
>> I was in a similar situation as Phil, and I used (1). It's not
>> straightforward to copy the database and make msnoise work again in a new
>> directory. But it's definitely possible.
>> I actually think it would be a nice addition to msnoise to not only make an
>> option for multiple filters, but also for multiple other parameters (window
>> lengths, overlaps, winsorizing, etc.). This would really help in the first
>> "exploratory phase" of finding out the best way to process your
>> dataset.
>> What do you think of this idea? Practically, I would implement it by moving
>> these parameters (window length etc.) to the filter parameters, and treat
>> them the same way as an additional filter. As far as I understand the
>> code, this wouldn't require many adaptations?
>> 
>> Lukas
>> 
>> 
>> 
>> 2016-05-01 11:35 GMT+02:00 Thomas Lecocq <Thomas.Lecocq at seismology.be>:
>> 
>>> Hi Phil,
>>> 
>>> I'd say (3) would be better indeed. You can script msnoise using the api.
>>> If you need to change params in the config, you can alternatively use the
>>> "msnoise config --set name=value" command.
>>> 
>>> Please keep me updated of your progresses & tests !
>>> 
>>> Thomas
>>> 
>>> 
>>> 
>>> On 01/05/2016 10:34, Phil Cummins wrote:
>>> 
>>>> Hi again,
>>>> 
>>>> As some of you may recall, I'm just getting started with msnoise. I have
>>>> a large database and have managed to get my station and data availability
>>>> tables populated.
>>>> At this point, rather than running through the whole database, processing
>>>> it with parameters I hope might work, I'd rather process small subsets,
>>>> e.g. 1 day at a time, to experiment with window lengths, overlaps, etc., to
>>>> find what seems optimal. My question is, what's the best way to process
>>>> subsets of my database?
>>>> It seems to me I have several options:
>>>>     (1) Make separate databases for each subset I want to test, and run
>>>> through the workflow on each
>>>>     (2) Set start and end times appropriate for my subset, re-scan and
>>>> run through the workflow.
>>>>     (3) Populate the jobs table, and write a script to activate only the
>>>> jobs I want and not the others.
>>>> I want to do a fair bit of testing using different parameters before I run
>>>> through the whole thing, so I think (3) may be best. But any advice would
>>>> be appreciated.
>>>> Regards,
>>>> 
>>>> - Phil
>>>> _______________________________________________
>>>> MSNoise mailing list
>>>> MSNoise at mailman-as.oma.be
>>>> http://mailman-as.oma.be/mailman/listinfo/msnoise
>>>> 
>>> _______________________________________________
>>> MSNoise mailing list
>>> MSNoise at mailman-as.oma.be
>>> http://mailman-as.oma.be/mailman/listinfo/msnoise
>>> 
>> _______________________________________________
>> MSNoise mailing list
>> MSNoise at mailman-as.oma.be
>> http://mailman-as.oma.be/mailman/listinfo/msnoise
> 
> 
> 
> ------------------------------
> 
> Message: 3
> Date: Mon, 2 May 2016 17:41:20 +1000
> From: Phil Cummins <phil.cummins at anu.edu.au>
> To: Python Package for Monitoring Seismic Velocity Changes using
> 	Ambient	Seismic Noise <msnoise at mailman-as.oma.be>
> Subject: Re: [MSNoise] advice on processing database subsets
> Message-ID: <572704A0.1030608 at anu.edu.au>
> Content-Type: text/plain; charset="UTF-8"; format=flowed
> 
> Hi again,
> 
> Thanks for the comments. Here's what I did to set just a single day for 
> processing, so that I can test the parameter settings. I looked into the 
> API code and needed to import from msnoise_table_def.py, but it seems to 
> work OK:
> 
> from msnoise.api import connect
> from msnoise_table_def import Job
> 
> set_day = '2013-10-14'
> jobtype = 'CC'
> session = connect()
> # flag the chosen day's CC jobs as 'T' (to do)...
> jobs_set = session.query(Job).filter(Job.jobtype == jobtype).filter(Job.day == set_day)
> jobs_set.update({Job.flag: 'T'})
> # ...and flag every other CC job as 'D' (done) so it is skipped
> jobs_unset = session.query(Job).filter(Job.jobtype == jobtype).filter(Job.day != set_day)
> jobs_unset.update({Job.flag: 'D'})
> session.commit()
> 
> So now I have a jobs table with just the day I want set to 'T'. I hoped 
> I was ready to try 'msnoise compute_cc', but it seems to want me to set 
> Filters first. This appears to be referring to the MWCS filter 
> parameters? I am a little surprised, since I thought MWCS would come 
> later, and I don't understand how the CC computation would be dependent on 
> the MWCS filter parameters.
> 
> To tell you the truth, at the moment I am more interested in using the 
> msnoise cross-correlations as input to a tomography algorithm than 
> in MWCS itself. In any case, I am keen to look at the CCs to see that 
> they make sense before I move on to anything else.
> 
> Could you please advise whether there is a way to run 
> compute_cc without having to worry about the MWCS parameters?
> 
> Thanks,
> 
> - Phil
> 
> 
> Thomas Lecocq wrote:
>> Hi guys,
>> 
>> Yeah, I have been thinking about a "benchmark" mode for quite a number 
>> of weeks, i.e. since I tested a first run of PWS in order to compare 
>> the final dv/v; to compare properly, I have to test quite a number of 
>> parameters.
>> 
>> My current idea is to run a set of possible parameters, for different 
>> steps. This would lead to a large number of branches in a large tree, 
>> but it would definitely be quite interesting.
>> 
>> I am really not in favor of duplicating the database; rather, I'd 
>> create a "config" file with a caller script to set/change 
>> parameters... Theoretically, the API should let you do all the 
>> actions. The only thing that would be a little trickier is to 
>> store/reuse the results of each step in order to compare them. For 
>> info, using the "shutil" module you can move/copy files easily.
>> 
>> Let's keep brainstorming on that and see how it goes !
>> 
>> Cheers
>> 
>> Thomas
>> 
>> On 01/05/2016 16:52, Lukas Preiswerk wrote:
>>> Hi all
>>> 
>>> I was in a similar situation as Phil, and I used (1). It's not
>>> straightforward to copy the database and make msnoise work again in a new
>>> directory. But it's definitely possible.
>>> I actually think it would be a nice addition to msnoise to not only make an
>>> option for multiple filters, but also for multiple other parameters (window
>>> lengths, overlaps, winsorizing, etc.). This would really help in the first
>>> "exploratory phase" of finding out the best way to process your
>>> dataset.
>>> What do you think of this idea? Practically, I would implement it by moving
>>> these parameters (window length etc.) to the filter parameters, and treat
>>> them the same way as an additional filter. As far as I understand the
>>> code, this wouldn't require many adaptations?
>>> 
>>> Lukas
>>> 
>>> 
>>> 
>>> 2016-05-01 11:35 GMT+02:00 Thomas Lecocq <Thomas.Lecocq at seismology.be>:
>>> 
>>>> Hi Phil,
>>>> 
>>>> I'd say (3) would be better indeed. You can script msnoise using the 
>>>> api.
>>>> If you need to change params in the config, you can alternatively 
>>>> use the
>>>> "msnoise config --set name=value" command.
>>>> 
>>>> Please keep me updated of your progresses & tests !
>>>> 
>>>> Thomas
>>>> 
>>>> 
>>>> 
>>>> On 01/05/2016 10:34, Phil Cummins wrote:
>>>> 
>>>>> Hi again,
>>>>> 
>>>>> As some of you may recall, I'm just getting started with msnoise. I 
>>>>> have
>>>>> a large database and have managed to get my station and data 
>>>>> availability
>>>>> tables populated.
>>>>> At this point, rather than running through the whole database, 
>>>>> processing
>>>>> it with parameters I hope might work, I'd rather process small 
>>>>> subsets,
>>>>> e.g. 1 day at a time, to experiment with window lengths, overlaps, 
>>>>> etc., to
>>>>> find what seems optimal. My question is, what's the best way to 
>>>>> process
>>>>> subsets of my database?
>>>>> It seems to me I have several options:
>>>>>     (1) Make separate databases for each subset I want to test, 
>>>>> and run
>>>>> through the workflow on each
>>>>>     (2) Set start and end times appropriate for my subset, re-scan 
>>>>> and
>>>>> run through the workflow.
>>>>>     (3) Populate the jobs table, and write a script to activate 
>>>>> only the
>>>>> jobs I want and not the others.
>>>>> I want to do a fair bit of testing using different parameters before I run
>>>>> through the whole thing, so I think (3) may be best. But any advice would
>>>>> be appreciated.
>>>>> Regards,
>>>>> 
>>>>> - Phil
>>>>> _______________________________________________
>>>>> MSNoise mailing list
>>>>> MSNoise at mailman-as.oma.be
>>>>> http://mailman-as.oma.be/mailman/listinfo/msnoise
>>>>> 
>>>> _______________________________________________
>>>> MSNoise mailing list
>>>> MSNoise at mailman-as.oma.be
>>>> http://mailman-as.oma.be/mailman/listinfo/msnoise
>>>> 
>>> _______________________________________________
>>> MSNoise mailing list
>>> MSNoise at mailman-as.oma.be
>>> http://mailman-as.oma.be/mailman/listinfo/msnoise
>> 
>> _______________________________________________
>> MSNoise mailing list
>> MSNoise at mailman-as.oma.be
>> http://mailman-as.oma.be/mailman/listinfo/msnoise
> 
> 
> ------------------------------
> 
> Message: 4
> Date: Mon, 2 May 2016 09:44:38 +0200
> From: Thomas Lecocq <thomas.lecocq at oma.be>
> To: msnoise at mailman-as.oma.be
> Subject: Re: [MSNoise] advice on processing database subsets
> Message-ID: <57270566.40602 at oma.be>
> Content-Type: text/plain; charset=utf-8; format=flowed
> 
> Hi Phil,
> 
> Nice piece of code.
> 
> Currently, a Filter is defined for both the CC step (whitening) and the 
> MWCS step. So you'll have to define the filter's bounds for the CC step 
> while keeping the MWCS values at 0, e.g. setting "low", "high", 
> "rms_threshold=0", "used=true", and you'll be good to go!
> 
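> In the same spirit as your script, a quick sketch (field names follow the 
> Filter table in msnoise_table_def; do double-check them against your 
> version):
> 
> from msnoise.api import connect
> from msnoise_table_def import Filter
> 
> session = connect()
> f = Filter()
> f.low = 0.1            # CC whitening band lower bound (Hz), example value
> f.high = 1.0           # CC whitening band upper bound (Hz), example value
> f.mwcs_low = 0.0       # MWCS parameters kept at 0 for now
> f.mwcs_high = 0.0
> f.mwcs_wlen = 0.0
> f.mwcs_step = 0.0
> f.rms_threshold = 0.0
> f.used = True
> session.add(f)
> session.commit()
> 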
> Thomas
> 
> On 02/05/2016 09:41, Phil Cummins wrote:
>> Hi again,
>> 
>> Thanks for the comments. Here's what I did to set just a single day for 
>> processing, so that I can test the parameter settings. I looked into 
>> the API code and needed to import from msnoise_table_def.py, but it 
>> seems to work OK:
>> 
>> from msnoise.api import connect
>> from msnoise_table_def import Job
>> 
>> set_day = '2013-10-14'
>> jobtype = 'CC'
>> session = connect()
>> jobs_set = session.query(Job).filter(Job.jobtype == jobtype).filter(Job.day == set_day)
>> jobs_set.update({Job.flag: 'T'})
>> jobs_unset = session.query(Job).filter(Job.jobtype == jobtype).filter(Job.day != set_day)
>> jobs_unset.update({Job.flag: 'D'})
>> session.commit()
>> 
>> So now I have a jobs table with just the day I want set to 'T'. I 
>> hoped I was ready to try 'msnoise compute_cc', but it seems to want me 
>> to set Filters first. This appears to be referring to the MWCS filter 
>> parameters? I am a little surprised, since I thought MWCS would come 
>> later, and I don't understand how the CC computation would be dependent 
>> on the MWCS filter parameters.
>> 
>> To tell you the truth, at the moment I am more interested in using the 
>> msnoise cross-correlations as input to a tomography algorithm than 
>> in MWCS itself. In any case, I am keen to look at the CCs to see 
>> that they make sense before I move on to anything else.
>> 
>> Could you please advise whether there is a way to run 
>> compute_cc without having to worry about the MWCS parameters?
>> 
>> Thanks,
>> 
>> - Phil
>> 
>> 
>> Thomas Lecocq wrote:
>>> Hi guys,
>>> 
>>> Yeah, I have been thinking about a "benchmark" mode for quite a 
>>> number of weeks, i.e. since I tested a first run of PWS in order to 
>>> compare the final dv/v; to compare properly, I have to test quite a 
>>> number of parameters.
>>> 
>>> My current idea is to run a set of possible parameters, for different 
>>> steps. This would lead to a large number of branches in a large tree, 
>>> but it would definitely be quite interesting.
>>> 
>>> I am really not in favor of duplicating the database; rather, I'd 
>>> create a "config" file with a caller script to set/change 
>>> parameters... Theoretically, the API should let you do all the 
>>> actions. The only thing that would be a little trickier is to 
>>> store/reuse the results of each step in order to compare them. For 
>>> info, using the "shutil" module you can move/copy files easily.
>>> 
>>> Let's keep brainstorming on that and see how it goes !
>>> 
>>> Cheers
>>> 
>>> Thomas
>>> 
>>> On 01/05/2016 16:52, Lukas Preiswerk wrote:
>>>> Hi all
>>>> 
>>>> I was in a similar situation as Phil, and I used (1). It's not
>>>> straightforward to copy the database and make msnoise work again in a new
>>>> directory. But it's definitely possible.
>>>> I actually think it would be a nice addition to msnoise to not only make an
>>>> option for multiple filters, but also for multiple other parameters (window
>>>> lengths, overlaps, winsorizing, etc.). This would really help in the first
>>>> "exploratory phase" of finding out the best way to process your
>>>> dataset.
>>>> What do you think of this idea? Practically, I would implement it by moving
>>>> these parameters (window length etc.) to the filter parameters, and treat
>>>> them the same way as an additional filter. As far as I understand the
>>>> code, this wouldn't require many adaptations?
>>>> 
>>>> Lukas
>>>> 
>>>> 
>>>> 
>>>> 2016-05-01 11:35 GMT+02:00 Thomas Lecocq <Thomas.Lecocq at seismology.be>:
>>>> 
>>>>> Hi Phil,
>>>>> 
>>>>> I'd say (3) would be better indeed. You can script msnoise using 
>>>>> the api.
>>>>> If you need to change params in the config, you can alternatively 
>>>>> use the
>>>>> "msnoise config --set name=value" command.
>>>>> 
>>>>> Please keep me updated of your progresses & tests !
>>>>> 
>>>>> Thomas
>>>>> 
>>>>> 
>>>>> 
>>>>> On 01/05/2016 10:34, Phil Cummins wrote:
>>>>> 
>>>>>> Hi again,
>>>>>> 
>>>>>> As some of you may recall, I'm just getting started with msnoise. 
>>>>>> I have
>>>>>> a large database and have managed to get my station and data 
>>>>>> availability
>>>>>> tables populated.
>>>>>> At this point, rather than running through the whole database, 
>>>>>> processing
>>>>>> it with parameters I hope might work, I'd rather process small 
>>>>>> subsets,
>>>>>> e.g. 1 day at a time, to experiment with window lengths, overlaps, 
>>>>>> etc., to
>>>>>> find what seems optimal. My question is, what's the best way to 
>>>>>> process
>>>>>> subsets of my database?
>>>>>> It seems to me I have several options:
>>>>>>     (1) Make separate databases for each subset I want to test, 
>>>>>> and run
>>>>>> through the workflow on each
>>>>>>     (2) Set start and end times appropriate for my subset, 
>>>>>> re-scan and
>>>>>> run through the workflow.
>>>>>>     (3) Populate the jobs table, and write a script to activate 
>>>>>> only the
>>>>>> jobs I want and not the others.
>>>>>> I want to do a fair bit of testing using different parameters before I run
>>>>>> through the whole thing, so I think (3) may be best. But any advice would
>>>>>> be appreciated.
>>>>>> Regards,
>>>>>> 
>>>>>> - Phil
>>>>>> _______________________________________________
>>>>>> MSNoise mailing list
>>>>>> MSNoise at mailman-as.oma.be
>>>>>> http://mailman-as.oma.be/mailman/listinfo/msnoise
>>>>>> 
>>>>> _______________________________________________
>>>>> MSNoise mailing list
>>>>> MSNoise at mailman-as.oma.be
>>>>> http://mailman-as.oma.be/mailman/listinfo/msnoise
>>>>> 
>>>> _______________________________________________
>>>> MSNoise mailing list
>>>> MSNoise at mailman-as.oma.be
>>>> http://mailman-as.oma.be/mailman/listinfo/msnoise
>>> 
>>> _______________________________________________
>>> MSNoise mailing list
>>> MSNoise at mailman-as.oma.be
>>> http://mailman-as.oma.be/mailman/listinfo/msnoise
>> _______________________________________________
>> MSNoise mailing list
>> MSNoise at mailman-as.oma.be
>> http://mailman-as.oma.be/mailman/listinfo/msnoise
> 
> 
> 
> ------------------------------
> 
> _______________________________________________
> MSNoise mailing list
> MSNoise at mailman-as.oma.be
> http://mailman-as.oma.be/mailman/listinfo/msnoise
> 
> 
> End of MSNoise Digest, Vol 27, Issue 2
> **************************************


