Subject: Re: Working with large volumes of data in Igor
In-Reply-To: <33f5b1fe-fbac-bd9f-4be7-2d72b4e78e4a@virtuell-zuhause.de>
References: <CAGjoBTUaSjy7z6OaAv4AXXt=+Jaj9bQ_uff63GMFN84nG0dm8w@mail.gmail.com>,
<33f5b1fe-fbac-bd9f-4be7-2d72b4e78e4a@virtuell-zuhause.de>
Message-ID: <59F1E233-E589-40E3-B9A1-FC8E06F95FDD@ppg.com>
If you are using Windows 10, change the Windows settings to reduce all the "eye candy". Basically, set up Windows for best performance. Your interface will look like Windows XP, but the performance boost is tremendous.
It is amazing how much the stupid OS eye candy causes your system to chug.
I forget exactly where the settings change is found...
Thanks.
Forrest
Forrest Blackburn
Sr. Research Associate
PPG
Photochromics R&D
Monroeville Business & Technical Center
440 College Park Drive
Monroeville, PA 15146
724-325-5851
724-331-7292 (mobile)
fblackburn at ppg.com
> On May 23, 2019, at 5:52 AM, Thomas Braun <thomas.braun at virtuell-zuhause.de> wrote:
>
> On 22.05.2019 at 00:22, Albert Aumentado wrote:
>
> Hi Albert,
>
>> This is a somewhat broad question but I wonder if anyone in the Igor
>> community has been routinely working with large volumes of data in Igor. I
>> am mostly concerned with speeding up processes in situations involving
>> large numbers of waves or traces.
>>
>> The type of scenario I am speaking about is working with 50000 waves or
>> having 10000 traces in a graph. My previous experience was working with
>> maybe 100x fewer objects. As I scaled up the amount of data to analyze, I
>> realized a lot of my routines and the visualization tools were starting to
>> lag. I have gone through some O-notation analysis as well as the function
>> profiler but have reached the end of my current programming knowledge.
>
> We are working with lots of traces (~5k) in IP8 on Windows.
>
> Getting this performant is tricky. A couple of things to consider
> (thanks to WM support for insight as well):
> - Don't access a single trace when building up the graph. This is
> currently quite expensive to do. Just use AppendToGraph.
> - Make the trace names maximally different. I'm using a counter to name
> them like "T1...", "T2...". This speeds up accessing them (if you have to).
> - Only plot a subset of the data: "Display wave[0,inf;16]" would only
> plot every 16th point. Depending on your data this might give you a
> false impression of the data though.
> - Check if you really need to use double as data type for the waves
> shown. Maybe float is enough as well.
> - If you are comfortable with a lines-only style in the graph and no
> opacity, look into the live mode flag of Display.
>
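A minimal sketch of how the graph tips above might fit together, assuming placeholder data. The function name, the "data_" wave prefix and the noise data are made up for illustration; only AppendToGraph, its /TN flag and the subrange syntax come from Igor itself.

```igor
// Hypothetical helper: builds a graph with many traces, following the tips above.
Function BuildManyTraceGraph(numTraces, numPoints)
	variable numTraces, numPoints

	variable i
	string traceName, wvName

	Display	// new empty graph; it becomes the top graph
	for(i = 0; i < numTraces; i += 1)
		// Counter right after the first character, so names differ early
		traceName = "T" + num2istr(i)
		wvName = "data_" + traceName

		// Make without /D creates single-precision (32-bit float) waves
		Make/O/N=(numPoints) $wvName
		WAVE w = $wvName
		w = gnoise(1)	// placeholder data, an assumption for this sketch

		// Append every 16th point only; no per-trace ModifyGraph calls in the loop
		AppendToGraph/TN=$traceName w[0,*;16]
	endfor
End
```

Calling BuildManyTraceGraph(5000, 10000) from the command line would be one way to try it. Whether decimating by 16 is acceptable depends on the data, as noted above.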
> We currently don't have to deal with a huge number of waves, but some
> general things to consider:
> - The problems with accessing waves from datafolders with lots of waves
> are gone since IP7. At [1] I've posted a code snippet and a graph which
> shows the access times of 10k waves having 100k waves in one datafolder.
> These are usually around 7e-6s.
> - Same double vs float advice from above.
> - Does your analysis code have more than linear complexity with respect
> to the number of waves?
> - Use separate threads for crunching the data. We are using a wrapper
> [2] around the Igor threading tools which lets you skip the low
> level fiddling. I can share the code if you are interested.
> - Use free waves for analysis code if possible. Although adding a global
> wave is quite cheap nowadays it still has some cost.
> - Depending on the expression, either MultiThread statements or MatrixOP
> is faster. This also depends on the number of cores your CPU has.
> - Regarding speeding up calculations, the index waves introduced in IP8
> can help to avoid even more explicit for loops. See DisplayHelpTopic
> "Indexing with an index wave".
> - For function profiling you can also use BeginFunctionProfiling() and
> EndFunctionProfiling() so you don't have to use the panel.
>
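Since the MultiThread versus MatrixOP question depends on the expression and the machine, timing both is probably the quickest way to decide. A rough sketch using free waves; the function name and the test expression are invented for illustration:

```igor
// Hypothetical timing comparison on free waves; the expression is only an example.
Function CompareMultiThreadAndMatrixOP(numPoints)
	variable numPoints

	// Free waves live only as long as their references, so no global cleanup
	Make/FREE/D/N=(numPoints) src
	Make/FREE/D/N=(numPoints) viaMultiThread
	src = enoise(1)

	variable refTime = StopMSTimer(-2)	// microsecond counter
	MultiThread viaMultiThread = sqrt(abs(src[p]))
	printf "MultiThread: %g s\r", (StopMSTimer(-2) - refTime) * 1e-6

	refTime = StopMSTimer(-2)
	MatrixOP/FREE viaMatrixOP = sqrt(abs(src))
	printf "MatrixOP: %g s\r", (StopMSTimer(-2) - refTime) * 1e-6
End
```

Running it with a few different values of numPoints should show where the crossover lies on a given CPU. For a real analysis, the expression would be replaced by the actual calculation.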
> From your earlier question on the mailing list I presume you are
> fetching the data from a database using the SQL XOP? Are you always
> fetching it again or do you cache the waves in the experiment?
>
> How many waves do you have in the experiment at a time? Is it 50k in
> total or per datafolder?
>
> Thomas
>
> [1]: https://www.wavemetrics.com/node/20924
> [2]: https://alleninstitute.github.io/MIES/asyncframework.html