Gauche > Archives > 2012/01/21

2012/01/21 01:10:25 UTCKirill
#
Hi!
2012/01/21 01:10:30 UTCshiro
#
hi kirill
2012/01/21 01:10:38 UTCKirill
#
finally I'm able to log on here.
#
so as mentioned earlier, Gauche seems to be crashing pretty hard when running multiple independent scripts with multiple threads
#
I'm not sure if it's relevant, but it always crashes in Scm_Require
#
any ideas about what I can look for?
2012/01/21 01:12:06 UTCshiro
#
hm, Scm_Require does some stuff with threads.
2012/01/21 01:12:26 UTCKirill
#
specifically it crashes in the line immediately after /* `Autoprovide' feature */
#
if (SCM_NULLP(SCM_CDDR(p))
        && SCM_FALSEP(Scm_Member(feature, ldinfo.provided, SCM_CMP_EQUAL))) {
        ldinfo.provided = Scm_Cons(feature, ldinfo.provided);
    }
2012/01/21 01:13:01 UTCshiro
#
That seems a big clue. Let me see...
2012/01/21 01:13:10 UTCKirill
#
apparently evaluating the condition in the "if" statement is what crashes it.
2012/01/21 01:14:04 UTCshiro
#
If SCM_CDR(p) isn't a pair, that will crash.
#
Can you insert SCM_ASSERT(SCM_PAIRP(SCM_CDR(p))) right before the check?
#
Well no, actually p can be SCM_FALSE.
#
SCM_ASSERT(SCM_PAIRP(p) && SCM_PAIRP(SCM_CDR(p)))
#
It smells like a bug that has finally surfaced by your usage pattern.
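The guard suggested above can be tried outside Gauche. What follows is a minimal sketch on a toy object model, not Gauche's real representation: `Obj`, `is_pair` and `checked_cddr_is_null` are hypothetical names standing in for ScmObj, SCM_PAIRP and the SCM_NULLP(SCM_CDDR(p)) test. It only shows why both p and its cdr have to be pairs before the cddr is read.

```c
#include <assert.h>
#include <stddef.h>

/* Toy tagged object: either a cons cell or some non-pair value such as #f.
 * This is an assumption for illustration, not Gauche's ScmObj layout. */
typedef struct Obj {
    int is_pair;          /* 1 for a cons cell, 0 for anything else */
    struct Obj *car;
    struct Obj *cdr;      /* NULL models the empty list */
} Obj;

/* Models SCM_PAIRP: true only for a real cons cell. */
static int is_pair(const Obj *p)
{
    return p != NULL && p->is_pair;
}

/* Models SCM_NULLP(SCM_CDDR(p)), but verifies the shape of p first.
 * Returns 1 if the cddr is the empty list, 0 if it is not,
 * and -1 if p or its cdr is not a pair (the case that crashed). */
static int checked_cddr_is_null(const Obj *p)
{
    if (!is_pair(p) || !is_pair(p->cdr)) return -1;
    return p->cdr->cdr == NULL;
}
```

With this shape check in place, a failed lookup (p being #f) is reported as -1 instead of dereferencing a non-pair.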
2012/01/21 01:17:00 UTCKirill
#
hurray
#
let me try that... should take a couple of minutes
#
the biggest thing I'm worried about is that the GC is doing something silly
#
but that's technically unlikely because the GC has way more users than Gauche =)
#
under what circumstances would "p" not be a list, anyway?
#
I guess if feature is not found in the alist?
2012/01/21 01:20:47 UTCshiro
#
apparently I thought p is always a list (that is, the module is already registered in the providing list). but this involves multiple threads, so there can be a scenario where it isn't.
2012/01/21 01:21:29 UTCKirill
#
can you provide an example scenario where the module isn't already registered?
2012/01/21 01:22:07 UTCshiro
#
I'm reading the code to remember what I was thinking when I wrote it.
2012/01/21 01:23:36 UTCKirill
#
okay, the assertion fails
2012/01/21 01:26:11 UTCshiro
#
good. so there's something I did wrong manipulating providing list.
2012/01/21 01:27:43 UTCKirill
#
can you talk about the logic the way it's done right now? I can look myself, but it will take longer
2012/01/21 01:29:23 UTCshiro
#
It is explained in the comment before Scm_Require. I don't think I can describe it more clearly than that.
#
Could you run the code with the following stub right before p = Scm_Assoc(feature, ldinfo.providing, SCM_CMP_EQUAL)?
#
Scm_Printf(SCM_CURERR, "requiring %S, providing %S\n",
               feture ldinfo.providing);
#
Oops, second line should read "feature, ldinfo.providing);"
#
It is to confirm that the feature indeed isn't in providing list, not that providing list is broken somehow.
2012/01/21 01:34:18 UTCKirill
#
sure, sec
2012/01/21 01:40:03 UTCKirill
#
ok so... it's requiring some feature that's not in the providing list
#
I'm assuming this just follows from the previous discussion
2012/01/21 01:40:52 UTCshiro
#
Thanks. I wanted to eliminate the possibility that GC wasn't looking at ldinfo.
2012/01/21 01:41:36 UTCKirill
#
np
#
how do features get removed from the ldinfo.providing?
#
ah sorry, right next to that
#
so is this a providing list management error, or a legitimate case where there's still something to do if providing doesn't contain what we're requiring?
2012/01/21 01:44:10 UTCshiro
#
It should be an error, an unexpected situation. The feature is chained to providing around line 995, and if that isn't executed, then Scm_Require should bail out before reaching the autoprovide thing.
2012/01/21 01:44:50 UTCKirill
#
should it bail out with "load failed", or ..?
#
or you mean, it should bail out with "no work to do"?
#
I see what's happening, maybe
2012/01/21 01:46:40 UTCshiro
#
"no work to do", or "a loop is detected".
2012/01/21 01:47:16 UTCKirill
#
is it possible that the providing list changes between the loop check and the autoprovide thing?
2012/01/21 01:49:32 UTCshiro
#
Yes, it is!
2012/01/21 01:49:53 UTCKirill
#
and is that bad? =)
#
any thoughts?
2012/01/21 01:53:56 UTCshiro
#
Well, actually it's a bit hard to see how it happens. If providing list has been changed, provided list is also changed and other threads should return "no work to do".
#
Ah, no.
2012/01/21 01:54:46 UTCKirill
#
I'm afraid I don't understand the situation, then.
2012/01/21 01:55:11 UTCshiro
#
Suppose thread A starts loading it, thread B waits for it, and then thread A fails. Thread A removes the feature from the providing list but does not add it to the provided list.
#
That's what's happening, isn't it?
2012/01/21 01:56:11 UTCKirill
#
you mean thread A is failing to load it? in that case, why should it add it to the provided list if the load failed?
2012/01/21 01:57:32 UTCshiro
#
No, it shouldn't. But if that happens, after thread B is woken up it's neither the loop case nor the provided case, so it goes on to load the feature, and when (somehow) thread B succeeds in loading it this time, it reaches the autoprovide check while there's no entry for the feature in the providing list.
#
Does the scenario sound plausible? In your settings is it possible that one thread somehow fails to load a feature, which another thread can load later?
2012/01/21 02:00:05 UTCKirill
#
I don't think so
#
the code is deterministic =) if something fails once, it'll always fail
#
I think it might be like this:
#
We have a bunch of threads running, some thread T is asking for feature X, which is in the providing list, but it's waiting on the mutex
#
but by the time thread T is awake, all the other threads have terminated, and thus the providing list doesn't contain the feature anymore
#
I have many short-lived threads. threads are created often, they do their little job, then they terminate; they do not linger in a pool or anything like that.
#
does this ring any bells?
2012/01/21 02:03:03 UTCshiro
#
You mean, all the other threads have terminated normally? If they're terminated normally, then they should've added the feature to provided list as well as removing from providing list, so T will return early with "no work to do".
2012/01/21 02:03:30 UTCKirill
#
yes, they terminated normally, but they were GC'd on termination; I'm not sure how that affects anything.
#
I mean, the garbage collection must have done something before they terminated, as it shoudl
#
should*
#
so the theory is that if I start a thread, it requires X, and then terminates, X will now be in the "provided" list forever?
2012/01/21 02:05:51 UTCshiro
#
GC shouldn't affect ldinfo which is a global data. Hmm... I might see something suspicious. If one of the feature does contain (provide "foo") form that suppresses autoprovide feature, it can happen...
2012/01/21 02:06:14 UTCKirill
#
yes, all the modules I wrote use (provide ...) explicitly
2012/01/21 02:06:23 UTCshiro
#
To answer your last question, yes, module loading/require/provide is global. Once it's loaded, it's permanent.
2012/01/21 02:06:23 UTCKirill
#
none of my code relies on autoprovide
#
can you explain further how the error can happen if my module (which I do (use my-module)) uses (provide ...) explicitly?
2012/01/21 02:08:23 UTCshiro
#
Okay, here's my scenario. With autoprovide, if you (require "foo") and foo.scm doesn't contain a provide form, it works as if there's a (provide "foo").
2012/01/21 02:08:38 UTCKirill
#
this sounds dangerous but okay
2012/01/21 02:08:59 UTCshiro
#
But foo.scm contains something like (provide "bar"), then autoprovide is suppressed.
2012/01/21 02:09:07 UTCKirill
#
you mean, (provide bar)
2012/01/21 02:10:02 UTCshiro
#
No, Gauche's native provide works on strings.
2012/01/21 02:10:32 UTCKirill
#
ah sorry, I was thinking about export
#
so to correct what I was saying -- my modules use export and use
#
not require/provide
#
so I guess the scenario doesn't apply?
2012/01/21 02:11:23 UTCshiro
#
Ah, you don't have 'provide' form at all? Hmm..
2012/01/21 02:11:30 UTCKirill
#
nope
#
sorry for the confusion. I'm too used to Racket's require/provide
#
I'll print the "provided" list before the loop enters, to see what's going on
#
maybe it's due to the GC after all, who knows.
2012/01/21 02:17:41 UTCKirill
#
ok so let's talk about the logic inside the loop
2012/01/21 02:17:47 UTCshiro
#
ok
2012/01/21 02:17:52 UTCKirill
#
the idea is that if it's already provided, then we're good to go
2012/01/21 02:18:05 UTCshiro
#
yes.
2012/01/21 02:18:20 UTCKirill
#
if it's being provided (i.e loaded) by something else, we check for dependencies. let's pretend there's no loops -- in that case the feature gets added to the waiting list
#
... and then we wait for the load to finish and then remove it from the waiting list?
2012/01/21 02:19:39 UTCshiro
#
the waiting thread is woken up whenever providing list has changed. we need to check the condition again anyway, so we remove from waiting list. if the condition doesn't satisfy, we'll add to waiting list again.
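The wake-up-and-re-check rule described above is the usual condition-variable pattern. Here is a minimal sketch with POSIX threads, under stated assumptions: `loading`, `begin_load`, `end_load` and `wait_until_loaded` are hypothetical names, the array stands in for the providing list, and the waiting list that Gauche uses for loop detection is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define MAX_LOADING 8

static pthread_mutex_t mu = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t changed = PTHREAD_COND_INITIALIZER;
static const char *loading[MAX_LOADING];  /* models the providing list */
static int n_loading = 0;

/* Caller must hold mu. Returns the index of feature, or -1 if absent. */
static int find_locked(const char *feature)
{
    for (int i = 0; i < n_loading; i++)
        if (strcmp(loading[i], feature) == 0) return i;
    return -1;
}

/* Mark feature as being loaded. Returns 0, or -1 if the table is full. */
int begin_load(const char *feature)
{
    int rc = -1;
    pthread_mutex_lock(&mu);
    if (n_loading < MAX_LOADING) {
        loading[n_loading++] = feature;
        rc = 0;
    }
    pthread_mutex_unlock(&mu);
    return rc;
}

/* Remove feature and wake every waiter so that each one re-checks. */
void end_load(const char *feature)
{
    pthread_mutex_lock(&mu);
    int i = find_locked(feature);
    if (i >= 0) loading[i] = loading[--n_loading];
    pthread_cond_broadcast(&changed);
    pthread_mutex_unlock(&mu);
}

/* Block until feature is no longer being loaded by anyone. */
void wait_until_loaded(const char *feature)
{
    pthread_mutex_lock(&mu);
    while (find_locked(feature) >= 0)     /* re-check after every wakeup */
        pthread_cond_wait(&changed, &mu);
    pthread_mutex_unlock(&mu);
}
```

The `while` around `pthread_cond_wait` is the important part: a wakeup only says that the list changed, not that the awaited feature is done, so the condition has to be tested again each time.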
2012/01/21 02:20:20 UTCKirill
#
I print feature + provided before the loop enters, and right before the crash:
#
requiring "srfi-13", provided ("nrj-ch" "soma" "nrj" "choco" "nrj-france" "sirius" "radium-tools" "sxml/ssax" "sxml/adaptor" "rfc/zlib" "rfc/md5" "util/digest" "rfc/http" "srfi-0" "gauche/interpolate" "gauche/partcont" "gauche/parameter" "gauche/hook" "gauche/mop/validator" "gauche/net" "gauche/modutil" "gauche/uvector" "util/queue" "rfc/uri" "rfc/822" "gauche/regexp" "gauche/condutil" "text/parse" "gauche/portutil" "gauche/experimental/lamb" "util/match" "srfi-14" "srfi-11" "rfc/cookie" "util/trie" "gauche/hashutil" "srfi-26" "srfi-19" "util/list" "gauche/sequence" "gauche/collection" "gauche/procedure" "gauche/common-macros" "gauche/charconv" "srfi-13" "srfi-1" "srfi-2" "srfi-6" "srfi-8" "srfi-10" "srfi-17")
#
note how srfi-13 is already in the provided list
#
so that's a little confusing.
2012/01/21 02:22:22 UTCshiro
#
weird. so that thread should return early from "no work to do" line, but it actually goes to the latter part, correct?
2012/01/21 02:22:35 UTCKirill
#
yes, that's the idea.
#
let me just double-check something...
#
does Scm_Printf accept regular format specifiers (e.g. c int)?
2012/01/21 02:24:13 UTCshiro
#
Yes. It's an extension to regular printf.
2012/01/21 02:24:19 UTCKirill
#
ok, just a second
2012/01/21 02:39:33 UTCKirill
#
still looking...
#
yep, it's requiring util/match, which is already in the list of provided, then crashes on assert
#
so to answer your question: definitely yes, that's what's happening.
#
any ideas as to what might be the case?
#
I should mention that the thread gets delayed significantly
#
i.e. the require message is in some spot in the log, then there's a lot of other thread activity, then this thread finally gets to the assert and dies
2012/01/21 02:43:05 UTCshiro
#
all right... this may sound silly, but could you check the return value of Scm_Member(feature, ldinfo.provided, ...) line? Just to make sure Scm_Member is working
2012/01/21 02:43:22 UTCKirill
#
=)
#
sure thing.
#
I would be surprised if it wasn't and we got this far, but I'll check
2012/01/21 02:44:03 UTCshiro
#
well, actually, if Scm_Member returns SCM_FALSE, let's print provided list again if there's any unexpected change.
2012/01/21 02:44:29 UTCKirill
#
so we want to print the list after if (!SCM_FALSEP(provided)) break;
#
?
2012/01/21 02:45:33 UTCshiro
#
I meant, just after provided = Scm_Member(...), print the value of variable provided and ldinfo.provided.
2012/01/21 02:45:56 UTCKirill
#
ok
#
PS: is there any reason why make-install (or framework.sh invocation) renders the source tree invalid?
#
e.g. running framework.sh twice is not possible
#
I have to clean and remake
2012/01/21 02:49:45 UTCshiro
#
I think that's a simple oversight. Depending on the platform, make install needs to relink the binary, which may interfere with building the source again.
2012/01/21 02:50:10 UTCKirill
#
relink the binary because of dynamic ld?
2012/01/21 02:50:52 UTCshiro
#
Haven't touched framework for long time so I'm not sure where the problem is, but on some platforms it is needed to fix rpath. I'm not sure if it applies to framework build, though.
2012/01/21 02:51:26 UTCKirill
#
ok so Scm_Member returns the right thing
2012/01/21 02:52:09 UTCshiro
#
Ugh... and it still passes through the if (!SCM_FALSEP(provided)) return 0; /* no work to do */ line?
2012/01/21 02:52:34 UTCKirill
#
let me check..
#
at this point I'm fairly sure it's some GC crap
#
magic doesn't happen in this way.
#
rebuilding... =)
2012/01/21 02:55:03 UTCshiro
#
GC code is mature, but using GC code properly is tricky, and "GC problems" are usually improper use of the GC. btw, about the framework problem: I guess if we move the OSX build to a path-independent way by default (that is, do not embed the --prefix path but search from where libgauche is), the build problem may be solved. Just a hunch.
2012/01/21 02:55:18 UTCKirill
#
while we're building..
#
is it possible that inside finalizable(), the vm is 0?
#
void finalizable(void)
{
    ScmVM *vm = Scm_VM();
    vm->finalizerPending = TRUE;
    vm->attentionRequest = TRUE;
}
#
here
#
... because I had at least one crash where vm was 0.
2012/01/21 02:57:45 UTCshiro
#
Well... yes it is. Good catch. I believe you can just guard the assignments by if (vm != NULL) { ... }.
#
I'll just fix it now.
2012/01/21 02:59:28 UTCKirill
#
yeah I fixed it already
#
but I wasn't sure if I was fixing something that was a result of another problem =)
2012/01/21 03:01:31 UTCshiro
#
It can happen when GC is called before Gauche runtime is fully initialized. It is a legitimate scenario, and if vm isn't initialized yet, there won't be a task that requires vm to cleanup. So it's safe, I believe.
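The NULL guard discussed above can be sketched in isolation. This is a toy model, not the actual core.c code: `VM`, `current_vm`, `get_vm` and `finalizable_guarded` are hypothetical stand-ins for ScmVM, the per-thread VM pointer, Scm_VM() and finalizable(), and the int return value exists only so the behavior can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ScmVM with just the two flags in question. */
typedef struct VM {
    int finalizerPending;
    int attentionRequest;
} VM;

/* Stays NULL until the runtime is initialized (models Scm_VM() == NULL). */
static VM *current_vm = NULL;

static VM *get_vm(void)
{
    return current_vm;
}

/* Guarded version of the callback: if the collector runs before any VM
 * exists, there is nothing to notify, so do nothing.
 * Returns 1 if a VM was flagged, 0 if there was no VM yet. */
int finalizable_guarded(void)
{
    VM *vm = get_vm();
    if (vm == NULL) return 0;
    vm->finalizerPending = 1;
    vm->attentionRequest = 1;
    return 1;
}
```

Skipping the notification is harmless here for the reason given above: before a VM exists, no task can be waiting on it for cleanup.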
2012/01/21 03:01:56 UTCKirill
#
ok
#
as long as this is some "high level" problem I'm happy
#
I hope it's not some kind of crazy memory corruption issue
#
because that would mean that gauche has deeper issues..
#
(I mean, the issue we're seeing with require, etc.)
2012/01/21 03:04:00 UTCshiro
#
Yeah, that's what I'm afraid of. Unreliable memory management is the worst nightmare.
2012/01/21 03:05:46 UTCKirill
#
hmm ok
#
so what's happening is this...
#
it's requiring some module
#
the module isn't in provided, so it loads it -- ok
#
while it's loading, it requires a bunch of other modules
#
... because the original module it's trying to load does (use ...) inside
#
and I guess after it's done loading all the internal (use) of that module, it comes back to the assert, and fails
#
I'm assuming that the call to Scm_Load there is recursive, right?
2012/01/21 03:07:36 UTCshiro
#
Yes. Such nested load is normal.
2012/01/21 03:07:46 UTCKirill
#
hmm
#
but the issue is that I have other scripts that (in parallel) try to load the module
#
so once it gets outside of the mutex, providing now has the module
#
and by the time Scm_Load is done, providing should _still_ have the module
#
but it doesn't =(
#
is that correct? if I load a module X, once it's out of the mutex and calls Scm_Load, X is in "providing"
#
so everyone else that requires it will be blocked on the condition variable
#
and then once the Scm_Load() returns, X should still be in "providing". is that right?
2012/01/21 03:10:31 UTCshiro
#
Yes, that's the way it's supposed to work.
2012/01/21 03:10:48 UTCKirill
#
but for whatever reason it's not happening... I'll dig around a little bit more.
#
I'm a little bit surprised that this isn't a common use case
#
the Lua programming language was designed specifically to execute small snippets, possibly from many threads
2012/01/21 03:12:39 UTCshiro
#
One assumption is that the thread who loads the feature is the only one who adds/removes the feature to/from the providing list. But that assumption isn't checked explicitly. If we can somehow check that assumption...
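One way the ownership assumption could be checked explicitly is to record who started each load. The sketch below is only an illustration under assumptions: `Entry`, `providing_add` and `providing_remove` are hypothetical names, thread ids are plain ints, and the functions are not thread-safe on their own (in real code they would run with the load lock held).

```c
#include <assert.h>
#include <string.h>

#define MAX_PROVIDING 16

typedef struct {
    const char *feature;
    int owner;            /* id of the thread that started the load */
} Entry;

static Entry providing[MAX_PROVIDING];
static int n_providing = 0;

/* Returns the index of feature in the table, or -1 if it is not listed. */
static int find_entry(const char *feature)
{
    for (int i = 0; i < n_providing; i++)
        if (strcmp(providing[i].feature, feature) == 0) return i;
    return -1;
}

/* Returns 0 on success, -1 if the feature is already listed or the
 * table is full. A duplicate is reported instead of silently added. */
int providing_add(const char *feature, int owner)
{
    if (find_entry(feature) >= 0 || n_providing >= MAX_PROVIDING) return -1;
    providing[n_providing].feature = feature;
    providing[n_providing].owner = owner;
    n_providing++;
    return 0;
}

/* Returns 0 on success, -1 if the feature is not listed,
 * -2 if it is listed but was added by a different thread. */
int providing_remove(const char *feature, int owner)
{
    int i = find_entry(feature);
    if (i < 0) return -1;
    if (providing[i].owner != owner) return -2;
    providing[i] = providing[--n_providing];
    return 0;
}
```

A nonzero return from either function would flag exactly the two "impossible" situations: a feature listed twice, or an entry removed by a thread that did not add it.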
2012/01/21 03:13:02 UTCKirill
#
but how could it be that another thread adds/removes it from the providing list if everyone else is waiting?
#
for example, it should _not_ be possible that a module is listed twice in the providing list, right?
2012/01/21 03:14:10 UTCshiro
#
The parallel load case has been tested in production. There should be something that has been missed in previous use cases. And to your question, right, it seems impossible. We're witnessing an impossible!
2012/01/21 03:17:30 UTCKirill
#
right
#
there are at least 3 instances of the module in the providing list
#
=D
2012/01/21 03:19:22 UTCshiro
#
Ah, that's a clue. So my logic is flawed somewhere. I'd like to try to reproduce it here. Are you using Gauche in the way you mentioned before, that is, there's already a bunch of thread running and you attach Gauche runtime to it? Or are you just using plain 'gosh'?
2012/01/21 03:19:37 UTCKirill
#
no no, it's dynamically linked and I attach the runtime
#
but please let me look at it for a few more minutes
#
it is possible that in all of this, I am the idiot.
#
(wouldn't be the first time)
#
for now, all the things that work in gosh are your fault, and all the things that don't work are mine =)
#
ok what happens when...
#
we are interested in module X
#
and we are interested in module Y
#
module Y requires module X
#
so what happens when we are loading module Y, it calls Scm_Load
#
and then another thread wants to load module X?
#
(directly)
#
hmm that should still work, though.
2012/01/21 03:23:12 UTCshiro
#
Assuming both threads use 'use' (or the underlying 'require'), only one of them actually loads it.
2012/01/21 03:23:20 UTCKirill
#
right
#
ok, still looking
#
ok so is there any reason why there's a while(0) and a continue right before?
#
if my brain is still working properly, the loop inside require is equivalent to:
#
do
  {
     printf("hi\n");
     continue;
  } while (0);
#
do you agree?
#
... because if you run that, "hi" will only be printed once, since condition is checked after "continue"
#
so perhaps this is the bug.
#
technically it's supposed to loop around and check again if it's in the provided list, etc., but as it is now, it only runs the loop once
#
again please feel free to tell me if I'm full of crap =)
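The claim about `continue` inside `do { ... } while (0)` is easy to check by counting passes. In C, `continue` in a do-while jumps to the condition test, not to the top of the body, so with a constant 0 condition the body runs exactly once. The function names below are made up for this sketch, and the second function is an unconditional loop added only for contrast.

```c
#include <assert.h>

/* Counts how many times the body of do { ...; continue; } while (0) runs. */
int passes_do_while_zero(void)
{
    int passes = 0;
    do {
        passes++;
        continue;         /* jumps to the condition, which is 0: loop ends */
    } while (0);
    return passes;
}

/* The same body in an unconditional loop: continue really does go around
 * again, so an explicit break is needed to stop after `limit` passes. */
int passes_unconditional(int limit)
{
    int passes = 0;
    for (;;) {
        passes++;
        if (passes >= limit) break;
        continue;         /* jumps back to the top of the loop body */
    }
    return passes;
}
```

So a retry written as `continue` at the bottom of a `do ... while (0)` block never retries; it just falls out of the loop.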
2012/01/21 03:31:53 UTCshiro
#
Oops, I think idiot is me.
#
It looks like removing continue and changing it to while(TRUE) should be the way to go.
2012/01/21 03:35:52 UTCKirill
#
right
#
I mean, right about removing continue
#
generally I'd just replace the loop with for (;;) { ... }
#
or equivalent
#
since break's are taken care of
#
the continue at the bottom is not necessary, then.
2012/01/21 03:37:35 UTCshiro
#
Yeah, for (;;) { ... } is fine. Now I wonder what I was thinking when I wrote the current code. And I guess I was just lucky that it has been working in this way.
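To make the effect of the fix concrete, here is a single-threaded sketch of the retry logic as it is described in this conversation, not a copy of Scm_Require. Everything in it is an assumption for illustration: `State`, `Script` and `require_decision` are hypothetical, and a scripted array of observations replaces the real mutex and condition variable.

```c
#include <assert.h>

/* What one pass of the check can observe about the requested feature. */
enum State { PROVIDED, LOADING_ELSEWHERE, NOT_SEEN };

/* A scripted sequence of observations, one per pass through the loop. */
typedef struct {
    const enum State *states;
    int n;
    int pos;
} Script;

/* Returns 0 for "no work to do", 1 for "this thread must load it",
 * and -1 if the script runs out while still waiting.
 * *checks receives the number of passes made. */
int require_decision(Script *s, int *checks)
{
    *checks = 0;
    for (;;) {
        if (s->pos >= s->n) return -1;
        enum State st = s->states[s->pos++];
        (*checks)++;
        if (st == PROVIDED) return 0;   /* someone else finished loading */
        if (st == NOT_SEEN) break;      /* nobody has it: go and load it */
        /* LOADING_ELSEWHERE: the real code would wait on a condition
         * variable here; reaching the end of the body loops again. */
    }
    return 1;
}
```

With the old single-pass loop, a thread woken after LOADING_ELSEWHERE would have fallen through to the load path without looking at the provided list again, which matches the duplicate providing entries seen earlier in the log.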
2012/01/21 03:37:57 UTCKirill
#
yeah, presumably because in most cases one pass through it was enough
#
the semantics of while/for loops can be tricky to remember sometimes, especially when break/continue are involved
#
the good news is that the module loading logic is fine =)
#
so all in all, this bug is nothing. you got the hard stuff right.
2012/01/21 03:39:54 UTCshiro
#
Is it working? I'll commit the fix then. And thanks a lot for tracking down this!
2012/01/21 03:40:02 UTCKirill
#
sec, rebuilding...
#
wow, amazing
#
first pass through a multi-script run without a crash
#
let's pretend it's fixed.
#
let me know when you've pushed this (and the core.c fix), so I can pull and continue =)
2012/01/21 03:44:02 UTCshiro
#
Pushed. (119157a)
2012/01/21 03:46:04 UTCKirill
#
thanks
#
well, I think I'm done for tonight. it's almost 11PM here
#
I'll let you know if I run into anything else =)
#
thanks for your patience!
2012/01/21 03:48:10 UTCshiro
#
Glad we made it work. Thanks!
2012/01/21 09:45:30 UTCshiro
#
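I've committed thread support for Windows/MinGW. It is enabled with ./configure --enable-threads=win32. It uses Win32 API threads directly rather than pthreads-win32. If you're interested, please give it a try. I plan to do a bit more work around thread-terminate!.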
2012/01/21 10:23:37 UTCkenhys
#
I heard that a fix for the MinGW build went in, so I tried it, but the situation is the same as before. http://pastebin.com/dXDSP7xr
2012/01/21 10:27:09 UTCshiro
#
Hmm. It works in my environment; I wonder what's different.
2012/01/21 10:36:23 UTCshiro
#
If you start the gosh under src directly
#
What happens if you start it directly ( ./gosh -ftest )?
2012/01/21 10:40:50 UTCkenhys
#
I just did a fresh checkout and cleaned everything up in order to rebuild, so I'll try it after the rebuild.
2012/01/21 11:13:25 UTCkenhys
#
I tried again, and this time the build went through. http://pastebin.com/P6HvSt9G
2012/01/21 11:15:45 UTCshiro
#
Oh, good.
2012/01/21 11:15:52 UTCkenhys
#
Here is what make check gives: http://pastebin.com/2eZJHWb2
2012/01/21 11:19:26 UTCshiro
#
Huh, make check passes completely here... Some of the test failures look like a line-ending difference, CR versus CRLF. Was there a setting in MinGW that changes line-ending handling globally?
2012/01/21 11:29:58 UTCkenhys
#
I don't know about MinGW itself, but the only thing I've done is disable autocrlf, because generating gauche/config.h fails unless the files are checked out from Git with LF endings.
2012/01/21 11:39:17 UTC齊藤
#
That's git config core.autocrlf false, right? I got tripped up by that too.
#
I believe line-ending handling in MinGW is determined solely by whether a file is opened in text mode or binary mode.
2012/01/21 11:44:50 UTCshiro
#
Oh, so maybe it's my settings that are off. No, autocrlf=false is set here.
2012/01/21 11:58:18 UTCshiro
#
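Judging from kenhys's log, Gauche means to output \n but it ends up as \r\n. Gauche opens ports in binary mode unless :element-type :character is given explicitly at open time, and it also switches standard I/O to binary mode at initialization, but maybe that isn't taking effect for some reason?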
2012/01/21 12:06:26 UTC齊藤
#
I built HEAD in my environment, and those tests pass here...
2012/01/21 12:07:46 UTCshiro
#
Do any other tests fail?
2012/01/21 12:08:49 UTC齊藤
#
The build seems to be going wrong in the ndbm-related part.
#
Collecting logs...
2012/01/21 12:10:51 UTCshiro
#
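Ah, I don't use gdbm in my environment, so it hasn't surfaced here. I don't think gdbm is distributed with mingw or msys; did you build it from source? If gdbm itself works and only ndbm fails, maybe the search for the ndbm-compatible library is failing?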
2012/01/21 12:11:39 UTC齊藤
#
It looks like a message about the permissions of an intermediate file (?).
2012/01/21 12:14:12 UTCshiro
#
Do you know which file it is?
2012/01/21 12:16:40 UTC齊藤
#
The error apparently went to stderr; I had redirected only stdout, so the message wasn't captured...
#
For now, here's a copy-paste from the console
#
../../src/gosh -ftest ../../src/precomp -e -o dbm--ndbm ndbm.scm
../../src/gosh -ftest ./ndbm-suffixes.scm ndbm-suffixes.h
*** SYSTEM-ERROR: unlink failed on ndbmtest4f1aab: Permission denied
Stack Trace:
_______________________________________
  0  (sys-unlink tname)
        At line 13 of "././ndbm-suffixes.scm"
make[2]: *** [ndbm-suffixes.h] Error 70
2012/01/21 12:20:24 UTCshiro
#
It looks like we've fallen into the trap where a file that is still open can't be unlinked.
#
Ah, of course.
#
--- a/ext/dbm/ndbm-suffixes.scm
+++ b/ext/dbm/ndbm-suffixes.scm
@@ -9,7 +9,8 @@
 (define (main args)
   (match (cdr args)
     [(file)
-     (receive (_ tname) (sys-mkstemp "ndbmtest")
+     (receive (p tname) (sys-mkstemp "ndbmtest")
+       (close-output-port p)
        (sys-unlink tname)
        (let1 p (run-process `("./ndbm-makedb" ,tname) :wait #t)
          (unless (zero? (process-exit-status p))
#
How about this?
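The same close-before-unlink rule can be shown at the C level. This is a minimal POSIX sketch, not part of Gauche: `reserve_temp_name` is a hypothetical helper that mirrors what the patched script does with sys-mkstemp, close-output-port and sys-unlink. On POSIX systems unlinking an open file succeeds anyway, so the explicit close only matters on platforms such as Windows that refuse to delete a file that is still open.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Creates a unique temporary file named <prefix>XXXXXX, closes it, and
 * removes it again, leaving only the unused name in buf.
 * Returns 0 on success and -1 on any failure. */
int reserve_temp_name(char *buf, size_t buflen, const char *prefix)
{
    int n = snprintf(buf, buflen, "%sXXXXXX", prefix);
    if (n < 0 || (size_t)n >= buflen) return -1;   /* name did not fit */

    int fd = mkstemp(buf);      /* creates and opens the file */
    if (fd < 0) return -1;

    /* Close first: some platforms will not delete a file that is open. */
    if (close(fd) != 0) {
        unlink(buf);
        return -1;
    }
    return unlink(buf);
}
```

After a successful call, buf holds a name that another program (ndbm-makedb in the script) can use to create its own files.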
2012/01/21 12:24:43 UTC齊藤
#
What is this?
#
[Window Title]
ndbm-makedb.exe

[Main Instruction]
ndbm-makedb.exe has stopped working

[Content]
A problem caused the program to stop working correctly. Windows will close the program and notify you if a solution is available.

[Close program (C)]
2012/01/21 12:27:19 UTCshiro
#
I think that's something like a SEGV or a bus error. (1) If you run ndbm-makedb.exe directly (with no arguments), does it print the usage? (2) What if you give it an argument, as in ndbm-makedb.exe foo?
2012/01/21 12:27:49 UTC齊藤
#
It prints Usage: ndbm-makedb <dbname>.
2012/01/21 12:28:49 UTCshiro
#
And with an argument?
2012/01/21 12:29:06 UTC齊藤
#
With ndbm-makedb.exe foo it stops with the same error. Does that mean it can't create the temporary file?
#
Is it OK to unlink the file before running ndbm-makedb?
2012/01/21 12:30:58 UTCshiro
#
That program just calls the ndbm API directly from C and opens an ndbm database under the name given as its argument.
2012/01/21 12:31:36 UTC齊藤
#
So it only generates a name; the file itself isn't used.
2012/01/21 12:32:14 UTCshiro
#
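Right. After that, it creates a database with ndbm-makedb and checks which files were actually created.
#
If ndbm-makedb crashes, that has nothing to do with Gauche; either I'm using the ndbm library wrong, or ndbm-makedb isn't being built correctly.
#
If you have gdb, running it under gdb might give us a few more clues.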
2012/01/21 12:38:58 UTC齊藤
#
(gdb) r foo
Starting program: c:\home\Gauche\ext\dbm/ndbm-makedb.exe foo
[New Thread 4372.0x6d0]

Program received signal SIGSEGV, Segmentation fault.
0x6085db86 in strsep () from c:\Mingw\msys\1.0\bin\msys-1.0.dll
2012/01/21 12:39:51 UTCkenhys
#
A minor point, but I think it would be good for mingw-dist.sh to check the exit status of make.
--- C:/Users/khayashi/AppData/Local/Temp/min319F.tmp/mingw-dist-HEAD-left.sh	Sat Jan 21 21:37:27 2012
+++ C:/MinGW/msys/1.0/home/khayashi/Project/gauche/Gauche-mingw-w32thread/src/mingw-dist.sh	Sat Jan 21 21:21:03 2012
@@ -50,8 +50,13 @@
   distdir=`pwd`/../Gauche-mingw-dist/Gauche
 fi  
 rm -rf $distdir
-./configure --enable-multibyte=utf8 --prefix=$distdir
+./configure --enable-multibyte=utf8 --enable-threads=win32 --prefix=$distdir
 make
+if [ $? -ne 0 ]; then
+	echo "failed to build gauche."
+	exit
+fi
+
 
 # prepare precompiled directory tree.
 make install
2012/01/21 12:40:17 UTCshiro
#
What does where show at that point? > 齊藤
#
Good point. > kenhys
2012/01/21 12:40:38 UTC齊藤
#
(gdb) where
#0  0x6085db86 in strsep () from c:\Mingw\msys\1.0\bin\msys-1.0.dll
#1  0x6082d767 in msys-1!calloc () from c:\Mingw\msys\1.0\bin\msys-1.0.dll
#2  0x6088d0d1 in strftime () from c:\Mingw\msys\1.0\bin\msys-1.0.dll
#3  0x6082d460 in msys-1!malloc () from c:\Mingw\msys\1.0\bin\msys-1.0.dll
#4  0x6a6c13df in msys-gdbm_compat-3!dbm_open ()
   from c:\Mingw\msys\1.0\bin\msys-gdbm_compat-3.dll
#5  0x004013f3 in main (argc=2, argv=0x5d17d8) at ndbm-makedb.c:39
2012/01/21 12:42:09 UTCshiro
#
Hmm, unless I'm using dbm_open incorrectly, it looks like something odd is going on inside gdbm...
2012/01/21 12:42:51 UTC齊藤
#
ndbm-makedb.c is tiny, so there doesn't seem to be much room for a problem there.
2012/01/21 14:09:10 UTC齊藤
#
After switching to static linking it no longer crashes, but it prints
#
dbm_open failed for foo: No such file or directory
#
and the open fails.
#
The problem seems to be the link function used inside the gdbm library.
#
Windows has no equivalent facility, so msys-1.0.dll apparently emulates it, but it looks like just a dummy.
#
So I don't think this is a Gauche problem.
2012/01/21 14:25:36 UTCkenhys
#
I tried a build with --enable-threads=win32, but this one failed while building libgauche-0.9.dll.
#
http://pastebin.com/HFk4siBt
#
It seems gc/win32_threads.o isn't being compiled properly in my environment.
#
$ nm gc/win32_threads.o
00000000 b .bss
00000000 d .data
00000000 N .debug_abbrev
00000000 N .debug_info
00000000 N .debug_line
00000000 t .text
2012/01/21 14:39:06 UTC齊藤
#
Is the installed Gauche 0.9.2?
2012/01/21 14:41:08 UTCkenhys
#
It's 0.9.2.
2012/01/21 15:21:45 UTC齊藤
#
gauche-thread-type seems to be returning none; that probably ought to be fixed.
2012/01/21 15:29:14 UTC齊藤
#
I wonder what's different from kenhys's environment.
2012/01/21 21:56:33 UTCshiro
#
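Why did CSS allow hyphens in identifiers in the first place, I wonder. Is something like DSSSL related? http://uupaa.hatenablog.com/entry/2012/01/22/013509
#
In the Lisp world the hyphen has always been the word separator in variable names, and editors support that.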