From fblackburn at ppg.com Thu May 23 04:49:32 2019
From: fblackburn at ppg.com (Blackburn, Forrest R.)
Date: Thu, 23 May 2019 11:49:32 +0000
Subject: <EXT>Re: Working with large volumes of data in Igor
In-Reply-To: <33f5b1fe-fbac-bd9f-4be7-2d72b4e78e4a@virtuell-zuhause.de>
References: <CAGjoBTUaSjy7z6OaAv4AXXt=+Jaj9bQ_uff63GMFN84nG0dm8w@mail.gmail.com>
<33f5b1fe-fbac-bd9f-4be7-2d72b4e78e4a@virtuell-zuhause.de>
Message-ID: <d24fcae15c354a59be37cfa57a2b35c7@ppg.com>

To follow up, on Windows you can turn off the visual effects:
Control Panel\All Control Panel Items\System
-> Advanced System Settings
-> Advanced tab
-> Performance Settings
-> Adjust for best performance

Your interface will be ugly, but you will be amazed at the performance boost in Igor Pro and everything else.
Note that you don't get the same level of boost in IP7, but the improvement in IP8 "knocked my socks off".

I would assume you can do something comparable on a Mac.



Thanks,
Forrest

Forrest Blackburn
Sr. Research Associate
PPG
Photochromics R&D
Monroeville Business & Technical Center

440 College Park Drive
Monroeville, PA 15146
Phone: 724-325-5851
Mobile: 724-331-7292
Fax: 725-325-5225
E-Mail: fblackburn at ppg.com
Web: www.ppg.com



-----Original Message-----
From: Info-igor info-igor-bounces at lists.info-igor.org On Behalf Of Thomas Braun
Sent: Thursday, May 23, 2019 5:51 AM
To: info-igor at lists.info-igor.org
Subject: <EXT>Re: Working with large volumes of data in Igor



On 22.05.2019 at 00:22, Albert Aumentado wrote:

Hi Albert,

> This is a somewhat broad question but I wonder if anyone in the Igor
> community has been routinely working with large volumes of data in Igor. I
> am mostly concerned with speeding up processes in situations involving
> large numbers of waves or traces.
>
> The type of scenario I am speaking about is working with 50000 waves or
> having 10000 traces in a graph. My previous experience was working with
> maybe 100x less objects. As I scaled-up the amount of data to analyze, I
> realized a lot of my routines and the visualization tools were starting to
> lag. I have gone through some O-notation analysis as well as the function
> profiler but reach the end of my current programming knowledge.

We are working with lots of traces (~5k) in IP8 on Windows.

Getting this to perform well is tricky. A couple of things to consider
(thanks to WM support for insight as well):
- Don't access a single trace while building up the graph. This is
currently quite expensive. Just use AppendToGraph.
- Make the trace names maximally different. I'm using a counter to name
them like "T1...", "T2...". This speeds up accessing them (if you have to).
- Only plot a subset of the data. "Display wave[0,*;16]" would only
plot every 16th point. Depending on your data this might give you a
false impression of the data though.
- Check if you really need double as the data type for the waves
shown. Maybe float is enough.
- If you are comfortable with a lines-only style in the graph and no
opacity, look into the live mode flag of Display.
We currently don't have to deal with a huge number of waves, but some
general things to consider:
- The problems with accessing waves from datafolders with lots of waves
are gone since IP7. At [1] I've posted a code snippet and a graph which
shows the access times of 10k waves having 100k waves in one datafolder.
These are usually around 7e-6s.
- Same double vs float advice from above.
- Does your analysis code have more than linear complexity with respect
to the number of waves?
- Use separate threads for crunching the data. We are using a wrapper
[2] around the Igor threading tools which lets you skip the low-level
fiddling. I can share the code if you are interested.
- Use free waves for analysis code if possible. Although adding a global
wave is quite cheap nowadays, it still has some cost.
- Depending on the expression, either MultiThread statements or MatrixOP
is faster. This also depends on the number of cores your CPU has.
- For speeding up calculations, the index waves introduced in IP8
can help to avoid even more explicit for loops. See DisplayHelpTopic
"Indexing with an index wave".
- For function profiling you can also use BeginFunctionProfiling() and
EndFunctionProfiling() so you don't have to use the panel.
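
As a rough, untested illustration of the free-wave, float and MultiThread/MatrixOP points above, the following sketch times the same element-wise expression both ways. The function name CompareSqrt and the choice of sqrt as the test expression are assumptions made for illustration. Which variant wins depends on the expression and on the machine.

```igor
// Hypothetical benchmark (name and expression are assumptions).
// Uses single-precision free waves, which are discarded automatically
// when the function returns.
Function CompareSqrt(numPoints)
	Variable numPoints

	Make/FREE/S/N=(numPoints) src = p
	Make/FREE/S/N=(numPoints) viaMT

	// StopMSTimer(-2) returns a running microsecond count
	Variable t0 = StopMSTimer(-2)
	MultiThread viaMT = sqrt(src[p])
	Variable usMT = StopMSTimer(-2) - t0

	t0 = StopMSTimer(-2)
	MatrixOP/FREE viaMOP = sqrt(src)
	Variable usMOP = StopMSTimer(-2) - t0

	printf "MultiThread: %g us, MatrixOP: %g us\r", usMT, usMOP
End
```

Running, for example, CompareSqrt(1e7) with a few different expressions gives a quick feel for which approach suits a given calculation on a given CPU.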

From your earlier question on the mailing list I presume you are
fetching the data from a database using the SQL XOP? Are you always
fetching it again or do you cache the waves in the experiment?

How many waves do you have in the experiment at a time? Is it 50k in
total or per datafolder?

Thomas

[1]: https://www.wavemetrics.com/node/20924
[2]: https://alleninstitute.github.io/MIES/asyncframework.html
_______________________________________________
Info-igor mailing list
Info-igor at lists.info-igor.org
http://lists.info-igor.org/listinfo.cgi/info-igor-info-igor.org