

How to speed up registration-process on large mosaics?

(@walsc)
Neutron Star
Joined: 6 years ago
Posts: 132
Topic starter  

I am currently building a computer dedicated to APP for large mosaics. As I am not that rich I am using old server-parts because these can take a lot of memory modules and still have a lot of processing power.
The computing-part that consumes the most time on my MacStudio with M1 Ultra is the registration-process - it takes about 10-12 hours for registering 109 tiles with 20 Megapixels each (still adding tiles). But this part seems not to be a very CPU-intense workload.
What would be necessary to speed up the registration process?
Faster SSD? More memory? More CPU-cores?
Thank you!



   
(@walsc)

Since all support has now vanished into thin air, I have to find out for myself.

I'm currently testing two configurations to find out what affects which process and in what way.

Config 1 is an Apple Mac Studio:
M1 Ultra (2022), 64 GB RAM (20-core CPU, 5 nm, max. 3.2 GHz, 800 GB/s bandwidth), macOS Sonoma 14.5.

Config 2 is an older first-generation dual AMD EPYC system (2017):
2x AMD EPYC 7551 (2x 32 cores, 14 nm, max. 3 GHz, 176 GB/s bandwidth), 512 GB RAM (actually 480 GB, one DIMM is damaged), Microsoft Windows 10 Pro.

Test scenario 1:
Stack one OSC session at default settings with 119 FITS images of 20 megapixels each.

Test scenario 2:
Build a mosaic out of 115 finished FITS images of roughly 20 megapixels each.

These two tests have very different requirements, what will we learn?

So far I have the results from test scenario 1 (time until an image is displayed on the screen):

Apple M1 Ultra 64GB: 6min 38s 

Dual-EPYC 7551 480GB: 10min 15s
Dual-EPYC 7551 480GB with SMT turned off in BIOS: 9min 56s

So, nothing seems to beat Apple Silicon here. The M1 has no SMT or Hyper-Threading, and this scenario would not benefit from either: most of the work consists of intensive read and write operations with short computing times.
For this kind of task you would simply love Apple Silicon for its fast SSD and huge bandwidth, and also for its clock speed, because under Windows 10 the EPYC processor always seems to use its all-core boost, which results in a clock speed of only 2.52 GHz.

Test scenario 2 with the huge mosaic is currently ongoing, but it may take some days. I have to repeat the test on the EPYC system because there was a Java error at the end of the integration process.

But what I have seen so far: the very long registration process of this mosaic seems to benefit hugely from the number of CPUs with SMT enabled; this seems to speed up registration substantially.
During the integration process there are short peaks in memory usage, where APP uses more than 106 GB of RAM.

The Mac seems not to be faster in this test at first impression, results will follow. But one thing on the Mac is really bad here: The resulting image in the first run always has artifacts, I have to run the integration-process at least three times for an image without stripes.

More infos and results will follow soon.


This post was modified 2 years ago by Walter Leonhard Schramböck

   
(@dheyergmail-com)
Main Sequence Star
Joined: 5 years ago
Posts: 22
 

Curious if you stacked each individual panel first? The suggested way to build mosaics in APP is to first stack the individual panels and then build the mosaic.
On any computer, the more cores (threads) and RAM the better. My dream machine would be one of the AMD Threadripper Pro systems with 192 threads and up to 2 TB of RAM, but it is a little expensive. I will have to make do with my 8th-gen i7 for a while.



   
(@walsc)

Yes, I did not mention this. Of course every single image was individually stacked, cropped and light-pollution corrected so that the normalization process runs smoothly and gives a nice, even background. Gathering these 115 sessions took three years and is ongoing.

My dream machine would be something like the one you mentioned, but I can't afford it right now, and all I need is something capable of rendering my growing mosaic now and over the coming two years.

A high core count does not always help, especially in gaming and in software that performs many I/O operations that cannot be prefetched or parallelized.


(@walsc)

Now, here I am again at a point where I need support.
The Mac always completes the 115-tile mosaic, but the Windows machine does not: it ends the integration process with a Java error (some "out of bounds" message) instead of outputting an image. Both machines use identical settings and images.



   
(@walsc)

Since the main difference between these two configurations is that the EPYC system is used over remote desktop with no graphics card, I have now added a GPU. I also installed Java 8 Update 411.
Running the test again.

The error occurs at the end of the integration process, just before the image output would start, so the test is still meaningful enough as a benchmark.

 

Under these conditions the results for test-scenario 2 would be:

Apple M1 Ultra 64GB: 15h 55min

Dual EPYC 7551 480GB: 11h 25min

With SMT disabled the performance of the old EPYC is nearly 100% equal to the M1 Ultra.
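
To put the posted times side by side, the ratios can be worked out directly. The following is a minimal sketch in Java; the class and method names are made up for illustration, and the durations are simply the values quoted in this thread.

```java
import java.time.Duration;
import java.util.Locale;

final class BenchmarkRatio {
    // How many times longer `slower` took compared with `faster`.
    static double ratio(Duration slower, Duration faster) {
        return (double) slower.getSeconds() / faster.getSeconds();
    }

    public static void main(String[] args) {
        // Test scenario 1 (stacking 119 frames), times as posted.
        Duration m1Stack = Duration.ofMinutes(6).plusSeconds(38);
        Duration epycStack = Duration.ofMinutes(10).plusSeconds(15);

        // Test scenario 2 (115-tile mosaic), times as posted.
        Duration m1Mosaic = Duration.ofHours(15).plusMinutes(55);
        Duration epycMosaic = Duration.ofHours(11).plusMinutes(25);

        System.out.printf(Locale.ROOT,
                "Scenario 1: EPYC took %.2fx as long as the M1 Ultra%n",
                ratio(epycStack, m1Stack));
        System.out.printf(Locale.ROOT,
                "Scenario 2: M1 Ultra took %.2fx as long as the EPYC%n",
                ratio(m1Mosaic, epycMosaic));
    }
}
```

With these times, stacking comes out at roughly 1.5x in favor of the M1 Ultra, and the mosaic at roughly 1.4x in favor of the EPYC.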


(@walsc)

After a BIOS update and running APP as administrator there is no change: still no resulting mosaic image on the EPYC system.
So I will do it one more time, now without remote desktop, working directly at the machine.

If that fails again I will install Linux and start over testing.



   
(@dheyergmail-com)
 

What camera was used for the imaging and what scale setting are you using? A 115-panel mosaic requires A LOT of RAM to process 32-bit images. For testing you can set the scale to 0.3 and see how big you can go before the error starts happening. Also for testing, you can make 16-bit copies of the panels. A while back I chatted with the person who made a 3.5-gigapixel mosaic of the Milky Way, and he ended up renting a computer powerful enough to process such a big mosaic.
Another person I talked to who did a really big mosaic broke it up by building 2x2 panels. He never had more than 4 panels to stitch together.
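
As a rough illustration of why the scale setting matters, the sketch below estimates an upper bound for the mosaic size. It assumes 115 tiles of 20 megapixels with no overlap and one plane of 32-bit samples; real tiles overlap, so the true canvas is smaller, and how APP actually lays out its buffers is not known here. All class and method names are hypothetical.

```java
import java.util.Locale;

final class MosaicSizeEstimate {
    // Upper bound on mosaic pixels: tiles * pixelsPerTile * scale^2 (overlap ignored).
    static long estimatePixels(int tiles, long pixelsPerTile, double scale) {
        return Math.round(tiles * pixelsPerTile * scale * scale);
    }

    // Bytes needed for one plane of 32-bit (4-byte) samples.
    static long bytesPerPlane32(long pixels) {
        return pixels * 4L;
    }

    // True if one flat Java array could not hold that many elements.
    static boolean exceedsJavaArrayLimit(long pixels) {
        return pixels > Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        for (double scale : new double[] {1.0, 0.5, 0.3}) {
            long px = estimatePixels(115, 20_000_000L, scale);
            System.out.printf(Locale.ROOT,
                    "scale %.1f: ~%d pixels, ~%.1f GB per 32-bit plane, above int range: %b%n",
                    scale, px, bytesPerPlane32(px) / 1e9, exceedsJavaArrayLimit(px));
        }
    }
}
```

Under these assumptions the full-scale mosaic has about 2.3 billion pixels, which is above `Integer.MAX_VALUE` (2,147,483,647), while a scale of 0.5 cuts the pixel count to a quarter.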



   
(@walsc)

Most of your questions can be answered by reading my posts.
So, if the Mac with 64 GB RAM is able to render it, the EPYC with 480 GB RAM should easily do it.

I have now installed Linux to see how this goes. Test is running right now with Linux Mint.



   
(@walsc)

Well, test2 under Linux did end with the same Java-error, so my MacStudio is so far the only machine that can finish this task.

So I will end this interesting and time-consuming journey, sell the parts of the EPYC machine and buy a better-equipped Mac Studio as soon as I can afford it.

I wrote three emails to support in the last few days; not one has been answered so far. 🤔



   
(@walsc)

One thing I found that could cause trouble:
The BIOS was set to legacy mode. This could be a problem with large disks, and my APP working directory is on a 4 TB NVMe drive.
So I am now trying this with the BIOS in UEFI mode.
I don't have much hope that this is the solution, but who knows.



   
(@walsc)

So, this was not the solution. 
I have now added the last missing DIMM so that all 16 slots are populated. No change.

I now set the scale of saving the registered frames at 0.5, just to see if this will result in an image-output at the end. But that would not make any sense for the purpose of this machine and make the EPYC-build completely redundant. It very likely is.

Still nobody answering the support email.


(@walsc)

This error message appears on the EPYC machine, no matter whether Windows or Linux is installed.
And the error always comes at the end of the integration process, only on this computer.

Bildschirmfoto 2024 06 04 um 18.18.30
Bildschirmfoto 2024 06 04 um 18.19.01

 



   
(@dheyergmail-com)
 

This might have to do with the Java heap size. In some programs written in Java this can be changed in a config file, but I don't see a way to change it in the APP program folder.

I saw in this forum that support is currently really slow because of ongoing cancer treatments. The developer is usually very quick to help out but is probably physically unable to right now.



   
(@walsc)

But why does the trouble occur only on this machine and not on the Mac?

Yes, I knew about the moderator's health issue, but I thought there might have been more than just one person giving support... ?



   
(@dheyergmail-com)
 

I wish I knew. I wonder if the error would happen on an Intel Mac. Memory handling is much different on M chip Macs.

 



   
(@walsc)

What APP version was it when Mabula migrated to a newer Java version? 



   
(@walsc)

The value 2147483647 is the 32-bit signed integer limit, and therefore the boundary for an array index.
The integration process "knows" about the size limits and scales down the images before integrating.
So, something goes wrong here, seems like a bug to me.
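
To illustrate how a huge negative index can arise at all, here is a minimal sketch of 32-bit overflow in a row-major pixel index. This is not APP's code; whether APP computes indices this way is an assumption, and the canvas dimensions below are made-up values chosen only to cross the 2^31 - 1 limit.

```java
final class IndexOverflowDemo {
    // Row-major index in 32-bit int arithmetic; wraps around past 2^31 - 1.
    static int flatIndexInt(int x, int y, int width) {
        return y * width + x;
    }

    // Same index in 64-bit long arithmetic; no wrap-around for these sizes.
    static long flatIndexLong(int x, int y, int width) {
        return (long) y * width + x;
    }

    public static void main(String[] args) {
        int width = 50_000; // hypothetical canvas width in pixels
        int y = 45_000;     // hypothetical row near the bottom of the canvas
        int x = 10;

        System.out.println("int index:  " + flatIndexInt(x, y, width));
        System.out.println("long index: " + flatIndexLong(x, y, width));
        System.out.println("int limit:  " + Integer.MAX_VALUE);
    }
}
```

For these values the `int` version comes out negative while the `long` version does not. `Math.multiplyExact` and `Math.addExact` would throw an `ArithmeticException` instead of wrapping silently.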



   
(@walsc)

I am building an old XEON machine now, just to test if this problem occurs on other configurations, too. But this could take some days.



   
(@walsc)

I had the idea to ask a code-trained AI-model what to do about this problem.
The answer however is not helpful for someone who has no access to the code...

This is what codellama had to say:

To solve this problem, you need to ensure that your code does not attempt to access an array with a negative index. This can be done by checking the index against the length of the array before attempting to access it.

Here's an example of how you can modify your code to do this:

```
int[] myArray = new int[10];
int index = -1; // example value; a negative index is out of bounds

// Check the index against the length of the array
if (index >= 0 && index < myArray.length) {
    // Access the element at the specified index
    System.out.println(myArray[index]);
} else {
    // Handle the error condition
    System.err.println("Index out of bounds: " + index);
}
```

By checking the index against the length of the array, you can ensure that your code does not attempt to access an array with a negative index. If the index is outside the bounds of the array, the code will handle the error condition by printing an error message.


(@walsc)

And this is what ChatGPT has to say:

The error message you are receiving, java.lang.ArrayIndexOutOfBoundsException: Index -2147483647 out of bounds for length 2147014284, indicates an issue with an array access in Java. This error means that an index used is outside the valid range of the array. In this case, the index is extremely negative (-2147483647), which typically suggests a problem like an overflow or a faulty calculation.

Here are some possible causes and solutions for this issue:

  1. Memory Issues:

    • Ensure your system has enough RAM to process the 115 images. Java-based programs can be very memory-intensive, especially when processing images.
    • You can try allocating more memory to the JVM (Java Virtual Machine) by adjusting the startup parameters, e.g., -Xmx4g for 4 GB of RAM.
  2. Data Corruption:

    • It’s possible that one or more of the images are corrupted. Try processing the images individually or in smaller groups to identify the problematic image.
  3. Software Bug:

    • There might be a bug in Astropixelprocessor itself. Check if you are using the latest version of the software. It can be helpful to look for updates or contact Astropixelprocessor support.
  4. JVM Version:

    • Ensure the Java version you are using is compatible with the version of Astropixelprocessor. Sometimes there are incompatibilities between different JVM versions and specific programs.
  5. Image Sizes:

    • Large image files can lead to index overflows. Make sure the image files are within the size limitations of the program.

Additional Information:

This issue only occurs on a computer with two AMD EPYC 7551 processors and 512 GB RAM, both on Windows 10 Pro and Linux Mint. On my MacStudio with an M1 Ultra and 64 GB RAM, I do not encounter this problem.

Here are some steps you can try:

  1. Adjust Startup Parameters: Increase the allocated memory for the JVM:

    java -Xmx4g -jar astropixelprocessor.jar

    (Adjust 4g according to available memory.)

  2. Process Images in Smaller Groups: Try processing the images in smaller groups to identify the problematic image.

  3. Software Update: Ensure you are using the latest version of Astropixelprocessor.

  4. Image Integrity Check: Check the integrity of the image files and ensure no files are corrupted.

  5. Environment-Specific Issues: Since the problem only occurs on the AMD EPYC system, it could be related to specific hardware or driver issues. Ensure all drivers and firmware are up to date. Additionally, try running the application with different JVM versions to see if that resolves the issue.

If these steps do not resolve the issue, providing more details about your environment (operating system, Java version, hardware specifications) and the specific settings and steps you are using in Astropixelprocessor could be helpful for further troubleshooting.
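
For anyone who wants to check what heap limit a JVM is actually running with, the small sketch below prints it. The class name is hypothetical, and it is an assumption that APP can be started with custom JVM options at all; the jar name in the quoted answer is not confirmed either.

```java
import java.util.Locale;

final class HeapInfo {
    // Maximum heap the JVM will attempt to use (this is what -Xmx controls).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Formats a byte count as GiB with two decimals.
    static String toGiB(long bytes) {
        return String.format(Locale.ROOT, "%.2f GiB", bytes / (1024.0 * 1024.0 * 1024.0));
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + toGiB(maxHeapBytes()));
        // Array lengths and indices are ints, whatever the heap size is.
        System.out.println("Integer.MAX_VALUE: " + Integer.MAX_VALUE);
    }
}
```

Note that a larger heap does not lift the limit of a single Java array: lengths and indices are `int` values, so they cannot exceed `Integer.MAX_VALUE` however much memory is available.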



   
(@walsc)

Good news!

Last night I uninstalled all Java updates from Windows.
When I launched APP after that, it asked me for the working directory, as if it were a fresh install.
Then I let it crunch the mosaic, which now has 117 tiles, and it did it! Flawless!

You can probably imagine that I am having a happy day now. 😉

So, the mosaic can keep growing for some additional years.

Bildschirmfoto 2024 06 09 um 13.17.57

 


(@artem)
Neutron Star
Joined: 8 years ago
Posts: 83
 

Very nice and detailed progress of success, well done, regards Martin 😀 PS! also excited to see how the mosaic will progress in the future.. CS Martin



   
(@walsc)

Additional info:
I am now testing this mosaic with different versions of APP.
Currently tested:

APP 2.0.0_beta26: OK
APP 2.0.0_beta28: Java-Error 
APP 2.0.0_beta29: Java-Error 

So, only with beta26 is this particular computing environment able to finish the big mosaic task.

My old dual-Xeon machine with 192 GB RAM will soon be ready for testing too (I have to buy some SSDs). It will be much slower (12 cores) and will have a hard time, but I really want to know... we will see.


(@walsc)

The EPYC-story continues:

Since the EPYC machine was not able to deliver any positive result again with the mosaic, I took everything apart for closer inspection. And I found something.
One of the CPUs has some bad contact-pins. 
So I am running this machine with half of the previous configuration - one CPU and 256GB RAM, and if this works I will replace the second CPU.

IMG 6143 2


   
(@walsc)

The current test seems to run at the same speed as before, maybe even faster. This means that the faulty CPU was not even involved in all the testing I have done so far. 😲
This is exciting...


(@walsc)

Excitement gone.
Same Java-Error in this configuration with one CPU.
Now I have to test all the RAM-Sticks, this is no fun any more.

Next week I will get a 4TB SSD for the old XEON server and run the mosaic-calculation there. I guess 32 hours or more will pass for this machine to complete the task, but if it does it without problems, that would be all I want, no matter how long it takes.



   
(@walsc)

Worked on the old XEON server.
Integration took longer there, but registration was much faster. Overall it was the fastest computer so far, and APP likes it the most.

I had installed Gecko Linux on the AMD EPYC server, but the window manager crashed after 13 hours, so no result on this machine.

The M1 Ultra takes about 18 hours to compose these 121 images and outputs a faulty image with streaks, making it the slowest of the three for the job.

The Xeon took 12 hours to do this and produced a flawless image.

What I learned about APP from this:
APP does not support more than 32 CPU cores under Windows 10 Pro at the moment, probably a license limitation in the development environment?
More than 12 CPU cores slow down the registration process, which is the biggest task in making a big mosaic. This is probably due to poor multithreading support in this process, which produces massive overhead between concurrent threads.
Apple's RISC processors are poor at extensive registration tasks, although Mabula has compiled native support for Apple Silicon. However, the M1 was the fastest of all in the integration process.
APP runs faster under Windows than under Linux.
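
One way to see how many logical processors a Java program is offered on a given machine is the short sketch below. The class name is hypothetical, and whether APP sizes its worker threads from these values is an assumption. A possible explanation for the apparent 32-core ceiling, not verified for APP, is that Windows splits systems with more than 64 logical processors into processor groups of at most 64, and software that is not group-aware stays within one group.

```java
import java.util.concurrent.ForkJoinPool;

final class CpuInfo {
    // Number of logical processors the JVM reports as available to it.
    static int logicalProcessors() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("Logical processors visible to the JVM: " + logicalProcessors());
        System.out.println("Common ForkJoinPool parallelism: "
                + ForkJoinPool.getCommonPoolParallelism());
    }
}
```

Running this on the dual-EPYC machine with SMT on and off would show whether the JVM sees all 128 threads or only part of them.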

So, I am done here; the parts from the EPYC server will be sold. Or is there a chance that APP will get a patch for AMD EPYC processors?


(@imnewhere)
Black Hole
Joined: 9 years ago
Posts: 174
 

I use a 64 core EPYC with 256gb ram and mine blasts through mosaics with no java failures, but I am running Windows 11 Pro on it. It is faster than my dual Xeon 44 core/88 thread machine with 256gb ram running Windows 10 Pro, but both are fairly fast. The biggest mosaic so far for me

The Epyc machine uses a 4tb and a 2tb NVME SSD in the 2 onboard M2 slots, the dual Xeon uses a pair of 2tb NVME SSDs in the onboard M2 slots. Both machines have more of them on cards. Both have an RTX 4090 gpu.

The biggest I have so far has 24 panels each of SHO, and I am pulling in 7 more each of SHO, so the final stack will be 31 each of SHO for a total of 93 master lights reloaded as light frames. Once I have that all in I will see how it is handled.



   
(@walsc)

Posted by: @imnewhere

I use a 64 core EPYC with 256gb ram and mine blasts through mosaics with no java failures, but I am running Windows 11 Pro on it. It is faster than my dual Xeon 44 core/88 thread machine with 256gb ram running Windows 10 Pro, but both are fairly fast. The biggest mosaic so far for me

The Epyc machine uses a 4tb and a 2tb NVME SSD in the 2 onboard M2 slots, the dual Xeon uses a pair of 2tb NVME SSDs in the onboard M2 slots. Both machines have more of them on cards. Both have an RTX 4090 gpu.

The biggest I have so far has 24 panels each of SHO, and I am pulling in 7 more each of SHO, so the final stack will be 31 each of SHO for a total of 93 master lights reloaded as light frames. Once I have that all in I will see how it is handled.

This is interesting. Which EPYC model are you using?

I have the EPYC 7551, a first generation EPYC.
Two motherboards were used: Supermicro H11DSi Rev 1.01 and also the H11DSi-NT Rev 2.0.
16x 32 GB reg. ECC 2666, no graphics card, only onboard VGA. I also tried it with an NVIDIA 1050, with the same results.
I swapped one CPU because of scratches on the contact-pins.
I tried one-cpu configuration with 256GB and two-cpu configuration with 512GB.
I tried swapping DIMMs.
I tried two Linux distributions (Linux Mint and Gecko Linux) and Windows 10 Pro.
I tried installing the OS on a 1TB NVMe and the Working directory on another 4TB NVMe.
I tried to install everything on only the 4TB NVMe.

It is strange that this machine managed to put out a result once but then never again.

It is very hard to tell what the cause of this trouble is.

 

