Tasks crashing

Message boards : Number crunching : Tasks crashing

gemini8
Joined: 8 Oct 20
Posts: 6
Credit: 721,940
RAC: 0
Message 361 - Posted: 4 Jan 2023, 9:31:24 UTC

Hello.
Nice we have new work for the New Year, but some of it is flawed.
I get this stderr output on several tasks, and so do other users on the same tasks:
<core_client_version>7.18.1</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)</message>
<stderr_txt>

</stderr_txt>
]]>

Thank you for having a look at the issues.
- - - - - - - - - -
Greetings, Jens
gemini8
Joined: 8 Oct 20
Posts: 6
Credit: 721,940
RAC: 0
Message 362 - Posted: 7 Jan 2023, 12:31:54 UTC

New work and apparently no issues anymore.
Thx.
- - - - - - - - - -
Greetings, Jens
[AF>Le_Pommier] Jerome_C2005
Joined: 28 Sep 20
Posts: 5
Credit: 485,014
RAC: 0
Message 363 - Posted: 13 Jan 2023, 8:48:17 UTC - in response to Message 362.  

The new work didn't last; I didn't even see that there was any...
stfn
Joined: 21 Apr 21
Posts: 13
Credit: 1,005,403
RAC: 0
Message 365 - Posted: 13 Jan 2023, 9:12:05 UTC

This morning I got a batch of Gaia_3 tasks; sadly, all of them failed at 16% :(
stfn
Joined: 21 Apr 21
Posts: 13
Credit: 1,005,403
RAC: 0
Message 366 - Posted: 13 Jan 2023, 9:16:11 UTC
Last modified: 13 Jan 2023, 9:30:41 UTC

<message>
Disk usage limit exceeded</message>


Huh, interesting: it seems Gaia broke my 10 GB limit for BOINC files. I've removed all limits, so let's see what happens now.

EDIT:

Even with no limits, tasks failed around the 15% mark, and by then Gaia was using about 18 GB of disk space (!). I'll suspend crunching new tasks until more is known.
zupa
Joined: 21 Aug 19
Posts: 75
Credit: 163,295
RAC: 0
Message 367 - Posted: 13 Jan 2023, 12:47:49 UTC - in response to Message 366.  

Hi, I'm trying to understand why the previous data needed 3 h of calculations while the current input data is computed so quickly. Because a fixed calculation time is assumed, the result file grows to a monstrous size. I have reduced the calculation time to 5 min and increased the allowed size of the result file.
I am very surprised by this situation and apologise for the problem. It was a new packet of data, as usual, and then such a surprise. A review of the results will show whether the calculations are correct, but there is nothing to indicate that they are wrong, just that you are computing incredibly fast this time.
stfn
Joined: 21 Apr 21
Posts: 13
Credit: 1,005,403
RAC: 0
Message 368 - Posted: 13 Jan 2023, 13:20:21 UTC - in response to Message 367.  

In that 16% I didn't see any change in calculation speed. Based on how fast it was going, the estimated task time would have been around 1.5-2 h, so not really a change from the previous batches. At least for me.
zupa
Joined: 21 Aug 19
Posts: 75
Credit: 163,295
RAC: 0
Message 369 - Posted: 13 Jan 2023, 13:29:54 UTC - in response to Message 368.  

The program draws random orbits based on the covariance matrix, then integrates the motion. It does this for 2 h.
If the processor is fast, it will draw 100 orbits; if it is slow, it will draw 3.
But right now, more than 1,000,000 orbits are being drawn.
I'm checking this data packet, because it seems to include a bug...
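
For readers following along, here is a minimal Python sketch of the scheme described above: draw a random initial state from a covariance matrix, integrate it, and repeat until a wall-clock budget runs out. It is not the project's actual code. The function names, the two-component state, the toy harmonic-oscillator integrator, and the optional `max_orbits` cap are all assumptions made for illustration only.

```python
import math
import random
import time


def cholesky(cov):
    """Return lower-triangular L with L * L^T == cov (cov symmetric positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def draw_state(mean, chol, rng):
    """Draw one random state: mean + L z, where z is standard normal."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(chol[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean)]


def integrate(state, dt=0.01, steps=1000):
    """Toy stand-in for the real integrator: 1-D harmonic oscillator, leapfrog steps."""
    x, v = state
    for _ in range(steps):
        v -= 0.5 * dt * x
        x += dt * v
        v -= 0.5 * dt * x
    return [x, v]


def run_time_budget(mean, cov, budget_s, rng, max_orbits=None):
    """Draw and integrate orbits until the time budget (or the optional cap) is hit."""
    chol = cholesky(cov)
    results = []
    deadline = time.monotonic() + budget_s
    while time.monotonic() < deadline:
        if max_orbits is not None and len(results) >= max_orbits:
            break
        results.append(integrate(draw_state(mean, chol, rng)))
    return results


if __name__ == "__main__":
    rng = random.Random(42)
    orbits = run_time_budget([1.0, 0.0], [[0.04, 0.01], [0.01, 0.09]], 0.5, rng)
    print(f"{len(orbits)} orbits in the time budget")
```

With a pure time budget, the number of results (and so the size of the output) depends on how quickly each orbit integrates, which is why unexpectedly cheap input can blow past a disk limit. A cap such as the hypothetical `max_orbits` above is one way to bound the output regardless of speed.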
stfn
Joined: 21 Apr 21
Posts: 13
Credit: 1,005,403
RAC: 0
Message 370 - Posted: 13 Jan 2023, 17:24:53 UTC

Still no luck: every task I try fails with a computation error a few minutes in.
Gaia01902USA
Joined: 6 Aug 21
Posts: 9
Credit: 150,524
RAC: 0
Message 376 - Posted: 26 Jan 2023, 23:22:53 UTC

Today (1/26) I'm getting a Computation Error on every task I try, so I've stopped in case it's me.
I don't know my way around BOINC very well, but for what it's worth I see
Output file 8_... for task 8_... absent. That might just be because there were no actual computations.
Every task runs for only a few seconds before it errors out.
mikey
Joined: 26 Feb 20
Posts: 17
Credit: 827,821
RAC: 0
Message 377 - Posted: 27 Jan 2023, 3:45:15 UTC - in response to Message 376.  

Nope, it's not you. It happened on several of my PCs, and I too stopped getting new tasks. Mine were all version 8 tasks.
zupa
Joined: 21 Aug 19
Posts: 75
Credit: 163,295
RAC: 0
Message 378 - Posted: 27 Jan 2023, 6:05:59 UTC - in response to Message 377.  

All computers are returning an error. I'm looking for the cause; thanks for the heads-up.
stfn
Joined: 21 Apr 21
Posts: 13
Credit: 1,005,403
RAC: 0
Message 379 - Posted: 27 Jan 2023, 9:23:24 UTC
Last modified: 27 Jan 2023, 9:24:40 UTC

Hi!

I am no longer getting any calculation errors, yay! But around 20-30% of my tasks fail with "download failed". Could it be a problem on my side?

EDIT: Probably not, such tasks fail for everyone: http://150.254.66.104/gaiaathome/workunit.php?wuid=9753
zupa
Joined: 21 Aug 19
Posts: 75
Credit: 163,295
RAC: 0
Message 380 - Posted: 27 Jan 2023, 14:44:27 UTC - in response to Message 379.  

Hi,

I see. The input files are on the system, so I don't know what happened :(

I will check once all the tasks have finished.
stfn
Joined: 21 Apr 21
Posts: 13
Credit: 1,005,403
RAC: 0
Message 381 - Posted: 30 Jan 2023, 14:01:47 UTC
Last modified: 30 Jan 2023, 14:02:52 UTC

And now the tasks are flying: no errors, very smooth. Thank you for your work in making this happen, Professor! :)

©2024 GAVIP-GC