malloc crash
Mark Geisert
mark@maxrnd.com
Tue Oct 26 08:30:13 GMT 2021
Replying to myself to correct something I wrote...
Mark Geisert wrote:
> Takashi Yano wrote:
>> On Mon, 25 Oct 2021 16:36:50 -0700
>> Mark Geisert wrote:
>>> Ken Brown wrote:
>>>> On 10/25/2021 5:29 PM, Mark Geisert wrote:
>>>>> Corinna Vinschen wrote:
>>>>>> On Oct 25 08:35, Ken Brown wrote:
>>>>>>> On 10/25/2021 4:59 AM, Corinna Vinschen wrote:
>>>>>>>> Has the thread already been started at this point?
>>>>>>>
>>>>>>> Yes, here's the backtrace of that thread:
>>>>>>>
>>>>>>> Thread 5 (Thread 9692.0x7c4c):
>>>>>>> #0 0x00000001801934f9 in sys_alloc (m=0x18036f860 <_gm_>, nb=1040) at
>>>>>>> ../../../../temp/winsup/cygwin/malloc.cc:4232
>>>>>>> #1 0x0000000180196b96 in dlmalloc (bytes=1024) at
>>>>>>> ../../../../temp/winsup/cygwin/malloc.cc:4669
>>>>>>> #2 0x00000001801993e1 in dlrealloc (oldmem=0x0, bytes=1024) at
>>>>>>> ../../../../temp/winsup/cygwin/malloc.cc:5187
>>>>>>> #3 0x00000001800e8eed in realloc (p=0x0, size=1024) at
>>>>>>> ../../../../temp/winsup/cygwin/malloc_wrapper.cc:73
>>>>>>
>>>>>> Er... huh? So both threads are in a malloc function? This shouldn't
>>>>>> have happened, given the clunky muto guarding malloc calls. This is
>>>>>> really strange. Why's the muto not working here?
>>>>>
>>>>> Is it possible both threads have executed malloc_init()?
>>>>> If so, the second one would reinit the muto.
>>>>
>>>> Or does the fifo_reader thread call a malloc function before the main thread has
>>>> called malloc_init()? This would presumably cause __malloc_lock() to fail, but
>>>> there's no error check.
>>>
>>> If there's a global constructor involved, that is known to happen. Constructors
>>> are run from dll_crt0_0(), before malloc_init() is called from dll_crt0_1(). See
>>> dcrt0.cc for the details.
>>
>> So how about moving malloc_init() call from dll_crt0_1() to dll_crt0_0()
>> so that malloc() can be called in fixup_after_fork/exec()?
>
> It appears simple, but this is a touchy area of code. The _0 and _1 are two
> separate phases of process startup. I'd want to hear Corinna's thoughts on this.
>
> I'd also like to verify somehow that this is the scenario Ken is hitting.
>
> When I was researching different mallocs for Cygwin I hit the constructor snag
> repeatedly. I did try delaying the constructor-running until after malloc_init().
> More problems. I did not try moving malloc_init() to before the constructor run.

Apologies; this was many months ago. What I actually tried was moving the
malloc_init() call to before the constructor chain runs, as Takashi suggested.
That is what gave me more problems. I don't recall what they were, but I
reverted that attempt.

The "future malloc" build of Cygwin I'm running doesn't exhibit Ken's issue, as
far as I can tell. It has a specific fix to avoid the scenario I've been talking
about here, but I don't want to take us down that path unless we're sure Ken's
hitting that same scenario.
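
To make the ordering scenario concrete, here is a minimal sketch. It is not
Cygwin's code: toy_lock, toy_malloc_init(), toy_malloc() and early_user are
hypothetical stand-ins for the muto, malloc_init(), the malloc wrapper and a
global object with a constructor. It only shows how an allocation made from a
global constructor can reach the allocator before the lock has been
initialized. The unguarded_calls counter is my addition for the demo; per
Ken's note above, the real code has no such check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for the lock that guards the allocator.
struct toy_lock {
    bool initialized = false;
    bool held = false;

    // Fails (returns false) if the lock was never initialized.
    bool acquire() {
        if (!initialized)
            return false;
        held = true;
        return true;
    }

    void release() {
        assert(held);
        held = false;
    }
};

static toy_lock alloc_lock;      // constant-initialized, so valid before any constructor runs
static int unguarded_calls = 0;  // demo-only counter of calls that ran without the lock

// Hypothetical stand-in for malloc_init(): the lock is usable only after this.
void toy_malloc_init() { alloc_lock.initialized = true; }

// Hypothetical stand-in for the malloc wrapper.
void *toy_malloc(std::size_t n) {
    bool locked = alloc_lock.acquire();
    if (!locked)
        ++unguarded_calls;       // allocator entered with no mutual exclusion
    void *p = std::malloc(n);
    if (locked)
        alloc_lock.release();
    return p;
}

// A global object whose constructor allocates. Its constructor runs before
// main(), i.e. before anything has had a chance to call toy_malloc_init().
struct early_user {
    void *buf;
    early_user() : buf(toy_malloc(1024)) {}
    ~early_user() { std::free(buf); }
};

static early_user early_obj;
```

In this sketch the constructor's allocation is the only one counted as
unguarded; anything allocated after toy_malloc_init() takes the lock normally.
If a second thread were in the allocator during that early window, nothing
would keep the two apart.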
..mark