V1.081 Throwing NULL POINTER exception in Register, Normalize and/or Integrate
I made a run with a large number of files: many Lum, HA and OIII. When I get to the Register, Normalize or Integrate operation, the application throws an exception (I think it can happen in any of these three operations). It will register the HA, but won't do the Lum or the OIII (it throws again after clicking OK). I repeatedly retried the operation and it finally worked. However, the subsequent integrations are strange in that the files appear in the file list simply as integrations (see attached) and cannot be opened. In other words, the application becomes unstable and must be restarted.
It appears to be a random occurrence. It was solidly stuck, throwing the error repeatedly, then decided to work at one point; another time it happened in Normalize and another in Integrate. It is hard to reproduce and I do not know what sequence of operations leads to this failure.
I have the same problem as jm. I processed 2 galaxies. The first was a multi session with LRGB and it worked fine. The next was a single session with HaLRGB and it worked until the Integration phase when I got numerous instances of the same error. Same errors as jm but in a different module. I have a fairly free day today so I'm going to play around a bit and see if I can isolate the problem a bit more.
Since the only place I found the errors was in the integration phase, I ran it again - more errors. So I cleared everything and started from scratch and it worked perfectly. Tried it again from scratch and it worked again. The only thing I noticed is that I didn't give a name to the failed run while I did with the successful ones. That certainly can't be it. I must have done something different in the failed run that I wasn't aware of doing. However, there were no errors in the calibration run and all runs just used the defaults, so... I think there is a bug but can't figure out how to reproduce it.
My problem is solid. My batch of lights simply cannot be processed. Clean start, even a reboot, no go.
Hopefully someone will chime in soon as I’m dead in the water now. I can successfully process each session independently. That will give me a set of FITS files for each filter, but then I cannot combine them in RGB because of different dimensions. So I’m stuck.
I might have to roll back to the preceding version.
Normalizing broadband (3-channel) data with narrowband (1-channel) isn’t possible (yet). This throws an exception. You need to split the channels for the broadband and then process all of them as 1-channel data, assigning R, G, B, Ha etc. to their own filters.
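For anyone wondering what "splitting the channels" amounts to, here is a minimal sketch in plain Python. It is only an illustration of the idea, not how APP does it internally: the function name `split_channels` and the nested-list image layout are assumptions made up for this example.

```python
def split_channels(rgb_image):
    """Split rows of (r, g, b) pixels into three single-channel images.

    Hypothetical helper: the image is assumed to be a list of rows,
    each row a list of (r, g, b) tuples.
    """
    red = [[px[0] for px in row] for row in rgb_image]
    green = [[px[1] for px in row] for row in rgb_image]
    blue = [[px[2] for px in row] for row in rgb_image]
    # Each entry is now 1-channel data that can get its own filter label.
    return {"R": red, "G": green, "B": blue}


# Tiny 2x2 stand-in for a 3-channel broadband frame.
rgb = [
    [(10, 20, 30), (11, 21, 31)],
    [(12, 22, 32), (13, 23, 33)],
]
channels = split_channels(rgb)
```

After the split, each of `channels["R"]`, `channels["G"]` and `channels["B"]` has the same shape as a 1-channel narrowband frame, which is why they can then be processed alongside Ha and the others.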
I also see you both have a lot of memory assigned to the OS (which is normal), but also give APP a lot of memory. When the total amount in the PC is limited, you may run into issues if APP is trying to use as much as possible and the OS protests against that. I would then advise lowering the amount APP can use and splitting up your data into smaller chunks to process, then combining those integrations together again.
Splitting up the integrations creates .fits files of slightly different dimensions, which cannot be combined because APP complains that the files do not match in size or bit depth. This is because dithering makes a stacked image of random dimensions, a result of shifting multiple images during integration.
Also, please note that processing a large number of images like I am doing worked well in v1.079. Why it is throwing uncaught errors in V1.081 is a mystery. Something has changed.
Just to put some more detail to what happened to my run. I had 437 lights consisting of LRGB. I originally said I had some Ha and I actually put in some Ha darks, flats and dark flats. However, there were no Ha lights. This run failed as I indicated above. I turned off the PC and went to bed. Next morning turned on PC and realized I had no Ha lights so I removed Ha flats etc. and ran again. This run worked. So, as a test I put back the Ha flats, darks and dark flats and ran again. This run also worked.
So, I checked my previous runs under 1.079. The previous run was about 400 lights consisting of LRGB and Ha. It worked fine. However, I can't find a run with more than 437 lights. So, under my current PC configuration - this could be my limit. The fact one run failed could just be because I ran it at the end of the day while the successful runs were run just after I restarted my PC. It is an older machine and maybe time to upgrade.
jm did you check your actual number of images and compare them with the number of images you could process in 1.079? 1.081 may require a little more processing power but do you see much difference between the sizes that each version can handle?
I'm reprocessing my lights now. I have 8 sessions with 336 lights. I'm using one bad pixel map and two dark frames (one for 60 sec for sessions 1 & 2, one for 180 sec for sessions 3,4,5,6,7,8).
I made a run for sessions 1,2,3,4 and 5. That worked.
Now I'm rerunning with session 6 added. I'm working on the theory that manually created sessions cause an issue. I also dropped my memory usage from the max of 7GB to 6GB to see if this would promote stability.
Oh, as I was typing this, it just failed again!
Hmmmm, so lowering memory usage does not do anything. The message states that it is failing in the routine to sort the file list by time.
OK, I'm going to try to bundle all my lights into 1 giant session.
We'll see what happens.
Well, OK. Using one session worked. Using a single custom session worked also. I put my memory usage back to 7GB. Worked. I did not use a ready-made Bad Pixel map, I let APP generate a new one from the Master Dark. I used two Master Darks also.
So I don't know, if I use multiple sessions with multiple darks and a bad pixel map, it fails. One session seems to work.
I cannot easily reproduce the error by selectively turning things on and off, so I have to guess it is something with sessions or sessions and multiple darks.
It's weird. But for now a single massive session seems to work.
Well, I spoke too soon. It fails with a single session too. I ran a single session run with > 300 lights. Everything ran fine, but at the end I only had one integration on the file list although there were 4 integrations made (one for each narrow-band filter). I also noticed that some integrations had not had any calibration frames applied. Weird, so I re-ran the integration and after the integration of my LUM, it failed again at the integration of the narrow-band filters.
This is bad.
Is Mabula around?
@jm I would re-run under 1.079. If it works then Mabula has something to work with. If it fails - well you look in a different direction. In my opinion this is the fastest way to find a solution.
I agree. So, I loaded V1.079 and it works perfectly.
This is clearly an issue with v1.081
Now, if we can only grab the attention of Mabula...
Yes, this is a nasty bug indeed... it is a concurrency bug that was introduced in 1.080, which explains why 1.079 works without problems. This new bug is caused by the change described in this release note of 1.080:
IMPROVED/FIXED, UPDATING FRAME LIST PANEL AND SORTING OF FRAMES, updating of the frame list panel at the bottom of the user interfaces has been greatly improved. It is much faster now. If you load more than 500 frames, then after the star analysis step, the updating and sorting on the analytical result for all frames would start to take several seconds in the old APP version. With more than 1000 frames, it really becomes very slow ! This problem is now completely solved. The updating and sorting on analytical results is now almost instant, even with more than 1000 frames loaded and analysed.
I had to change and speed up the updates/sorting of the frame list panel for the situation where hundreds of frames are loaded. Now this bug, since it is a concurrency bug, is hard to reproduce reliably because it manifests at different times and in different situations, seemingly at random... that is typical for concurrency bugs. With few frames loaded (fewer than 100) the bug would normally not be triggered. With 200+ frames the bug would happen a lot more often.
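To illustrate the kind of problem described here, below is a minimal sketch in Python of one common way to keep a sort safe while other threads are still adding results. It is not APP's actual code or the actual fix. The `FrameList` class, the `analyse` function and the file names are all hypothetical, and the technique shown (copy the list under a lock, then sort the copy) is just one standard approach.

```python
import threading


class FrameList:
    """Hypothetical frame list shared by worker threads and the UI."""

    def __init__(self):
        self._lock = threading.Lock()
        self._frames = []  # (filename, quality_score) tuples

    def add(self, filename, score):
        # Worker threads append their analysis results under the lock.
        with self._lock:
            self._frames.append((filename, score))

    def sorted_snapshot(self):
        # Copy under the lock, then sort the private copy outside it,
        # so the sort never sees the list change underneath it.
        with self._lock:
            snapshot = list(self._frames)
        return sorted(snapshot, key=lambda frame: frame[1], reverse=True)


def analyse(frames, start, count):
    # Stand-in for star analysis: each frame gets a made-up score.
    for i in range(start, start + count):
        frames.add(f"frame_{i:04d}.fits", float(i))


frames = FrameList()
workers = [
    threading.Thread(target=analyse, args=(frames, n * 100, 100))
    for n in range(4)
]
for worker in workers:
    worker.start()

# The "UI" may ask for a sorted view at any time while workers run.
partial = frames.sorted_snapshot()

for worker in workers:
    worker.join()
final = frames.sorted_snapshot()
```

Without the lock and the copy, whether the sort collides with an `add` depends on thread timing, which matches the observation that the failure shows up more often as the number of frames grows.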
Now, I think I have fixed the bug robustly with the 1.082 release, which I released today. Please try and let me know if everything works fine now 😉
With this fix, I have processed 300 and 600 frames in both multi-channel & session modes without any issues and with a fast update/sorting of the frame list panel with so many frames.
I have tried v1.082, and it works well! As a developer myself, I can appreciate issues with concurrency and race conditions.
I will continue to test; if there are any more issues, I will post them here.
Thank you for your help.
Oef, yes concurrency bugs are great...