Why does my Prolog S-expression tokenizer fail on its base case?

Posted on debugcn Dev

Caspian Ahlberg

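To learn some Prolog (I'm using GNU Prolog) and get a feel for its parsing abilities, I am starting by writing a Lisp (or S-expression, to be exact) tokenizer, which, given a list of characters like ['(', 'f', 'o', 'o', ')'], should produce ['(', 'foo', ')']. It's not working as expected, which is why I'm here! I thought my thought process came through in my pseudocode: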

tokenize([current | rest], buffer, tokens):
    if current is '(' or ')',
        Tokenize the rest,
        And the output will be the current token buffer,
        Plus the parenthesis and the rest.

    if current is ' ',
        Tokenize the rest with a clean buffer,
        And the output will be the buffer plus the rest.
    
    if the tail is empty,
        The output will be a one-element list containing the buffer.
    
    otherwise,
        Add the current character to the buffer,
        And the output will be the rest tokenized, with a bigger buffer.

I translated it into Prolog like this:

tokenize([Char | Chars], Buffer, Tokens) :-
    ((Char = '(' ; Char = ')') ->
        tokenize(Chars, '', Tail_Tokens),
        Tokens is [Buffer, Char | Tail_Tokens];
    Char = ' ' ->
        tokenize(Chars, '', Tail_Tokens),
        Tokens is [Buffer | Tail_Tokens];

    Chars = [] -> Tokens is [Buffer];

    atom_concat(Buffer, Char, New_Buffer),
    tokenize(Chars, New_Buffer, Tokens)).

print_tokens([]) :- write('.').
print_tokens([T | N]) :- write(T), write(', '), print_tokens(N).

main :-
    % tokenize(['(', 'f', 'o', 'o', '(', 'b', 'a', 'r', ')', 'b', 'a', 'z', ')'], '', Tokens),
    tokenize(['(', 'f', 'o', 'o', ')'], '', Tokens),
    print_tokens(Tokens).

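When I run it as follows: gprolog --consult-file lisp_parser.pl, it just tells me no. I traced main, and it gave me the stack trace below. I don't understand why tokenize fails for the empty case. I can see that the buffer is empty, because it was cleared at the preceding ')', but even if Tokens is empty at that point, wouldn't Tokens accumulate a larger result recursively? Can someone who is good with Prolog give me a few tips here?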

| ?- main.

no
| ?- trace.
The debugger will first creep -- showing everything (trace)

(1 ms) yes
{trace}
| ?- main.
      1    1  Call: main ? 
      2    2  Call: tokenize(['(',f,o,o,')'],'',_353) ? 
      3    3  Call: tokenize([f,o,o,')'],'',_378) ? 
      4    4  Call: atom_concat('',f,_403) ? 
      4    4  Exit: atom_concat('',f,f) ? 
      5    4  Call: tokenize([o,o,')'],f,_429) ? 
      6    5  Call: atom_concat(f,o,_454) ? 
      6    5  Exit: atom_concat(f,o,fo) ? 
      7    5  Call: tokenize([o,')'],fo,_480) ? 
      8    6  Call: atom_concat(fo,o,_505) ? 
      8    6  Exit: atom_concat(fo,o,foo) ? 
      9    6  Call: tokenize([')'],foo,_531) ? 
     10    7  Call: tokenize([],'',_556) ? 
     10    7  Fail: tokenize([],'',_544) ? 
      9    6  Fail: tokenize([')'],foo,_519) ? 
      7    5  Fail: tokenize([o,')'],fo,_468) ? 
      5    4  Fail: tokenize([o,o,')'],f,_417) ? 
      3    3  Fail: tokenize([f,o,o,')'],'',_366) ? 
      2    2  Fail: tokenize(['(',f,o,o,')'],'',_341) ? 
      1    1  Fail: main ? 

(1 ms) no
{trace}
| ?-

David Tonhofer

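How about this. I think it does what you want, but using Definite Clause Grammars (DCGs), which are just Horn clauses with :- replaced by --> and with two elided arguments holding the input character list and the remaining character list. An example DCG rule: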

rule(X) --> [c], another_rule(X), {predicate(X)}.

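This is a list-processing rule rule//1 that says: when you find the character c in the input list, continue list processing with another_rule//1, and when that has worked out, call predicate(X) as a normal goal.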
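To make the "two elided arguments" concrete, here is a small sketch of the rule above next to a hand-written clause that does roughly what the DCG translation produces. The definitions of another_rule//1 and predicate/1, the name rule_expanded/3, and the variable names S0, S1, S2 are all made up for illustration; the exact expansion differs between Prolog systems.

```prolog
% Hypothetical helpers, present only so that the example can run.
another_rule(X) --> [X].      % consume one item and call it X
predicate(X) :- atom(X).      % an ordinary (non-DCG) goal

% The DCG rule from the text.
rule(X) --> [c], another_rule(X), {predicate(X)}.

% Roughly the same rule written by hand: S0 is the input list,
% S is what remains after the rule has consumed its part.
rule_expanded(X, S0, S) :-
    S0 = [c|S1],              % the terminal [c]
    another_rule(X, S1, S2),  % the nonterminal, threading the list
    predicate(X),             % the {}/1 goal gets no extra arguments
    S = S2.
```

With these definitions, both phrase(rule(X), [c, d]) and rule_expanded(X, [c, d], []) should bind X to d.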

Then:

% If we encounter a separator symbol '(' or ')', we commit to the
% clause using '!' (no point trying anything else, in particular
% not the clause for "other characters"), tokenize the rest of the
% list, and when we have done that decide whether 'MaybeToken', which
% is "part of the leftmost token after '(' or ')'", should be retained.
% It is dropped if it is empty. The caller is then given an empty
% "part of the leftmost token" and the list of tokens, with '(' or ')'
% prepended: "tokenize('', [ '(' | MoreTokens] )  -->"
 
tokenize('', [ '(' | MoreTokens] ) -->
   ['('],
   !,
   tokenize(MaybeToken,Tokens),
   {drop_empty(MaybeToken,Tokens,MoreTokens)}.
   
tokenize('',[')'|MoreTokens]) --> 
   [')'],
   !,
   tokenize(MaybeToken,Tokens),
   {drop_empty(MaybeToken,Tokens,MoreTokens)}.
   
% No more characters in the input list (that's what '--> []' says).
% We succeed, with an empty token list and an empty buffer for the
% leftmost token.

tokenize('',[]) --> [].

% If we find a 'Ch' that is not '(' or ')', then tokenize
% more of the list via 'tokenize(MaybeToken,Tokens)'. On
% return, 'MaybeToken' is a piece of the leftmost token found
% in that list, so we have to stick 'Ch' onto its start.

tokenize(LargerMaybeToken,Tokens) --> 
   [Ch],
   tokenize(MaybeToken,Tokens),
   {atom_concat(Ch,MaybeToken,LargerMaybeToken)}.

% ---
% This drops an empty "MaybeToken". If "MaybeToken" is 
% *not* empty, it is actually a token and prepended to the list "Tokens"
% ---

drop_empty('',Tokens,Tokens) :- !.
drop_empty(MaybeToken,Tokens,[MaybeToken|Tokens]).

% -----------------
% Call the DCG using phrase/2
% -----------------

tokenize(Text,Result) :-
   phrase( tokenize(MaybeToken,Tokens), Text ),
   drop_empty(MaybeToken,Tokens,Result),!.

So:

?- tokenize([h,e,l,l,o],R).
R = [hello].

?- tokenize([h,e,l,'(',l,')',o],R).
R = [hel,(,l,),o].

?- tokenize([h,e,l,'(',l,l,')',o],R).
R = [hel,(,ll,),o].
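For comparison with the DCG version, the accumulator-style predicate from the question can probably be repaired directly as well. The trace shows the call tokenize([],'',_) failing, which is consistent with there being no clause whose first argument matches the empty list. In addition, is/2 evaluates arithmetic expressions, so building a list needs =/2 or plain unification instead. A minimal sketch, assuming the invented names tokenize_acc/3 and emit_buffer/3, might look like this:

```prolog
% tokenize_acc(+Chars, +Buffer, -Tokens)
% Names tokenize_acc/3 and emit_buffer/3 are illustrative only.

% Base case: input exhausted, so emit whatever is left in the buffer.
tokenize_acc([], Buffer, Tokens) :-
    emit_buffer(Buffer, [], Tokens).
tokenize_acc([Char|Chars], Buffer, Tokens) :-
    (   ( Char = '(' ; Char = ')' )
    ->  % A parenthesis ends the current token and is a token itself.
        tokenize_acc(Chars, '', Rest),
        emit_buffer(Buffer, [Char|Rest], Tokens)
    ;   Char = ' '
    ->  % A space ends the current token and is discarded.
        tokenize_acc(Chars, '', Rest),
        emit_buffer(Buffer, Rest, Tokens)
    ;   % Any other character extends the current token.
        atom_concat(Buffer, Char, Buffer1),
        tokenize_acc(Chars, Buffer1, Tokens)
    ).

% emit_buffer(+Buffer, +Tokens0, -Tokens)
% Prepend Buffer to Tokens0 unless it is the empty atom.
emit_buffer('', Tokens, Tokens) :- !.
emit_buffer(Buffer, Tokens, [Buffer|Tokens]).
```

Calling tokenize_acc(['(', f, o, o, ')'], '', Ts) should then bind Ts to ['(', foo, ')']. Unlike the DCG version, this sketch also treats a space as a separator, as the pseudocode in the question does.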

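I think GNU Prolog also has a quoted notation for hello that generates [h,e,l,l,o] directly, though which quote characters do this depends on the Prolog flags.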
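Independently of any quoting notation, the standard built-in atom_chars/2 converts an atom into a list of one-character atoms, which is the input format used in the examples here. A tiny sketch, where atom_to_chars/2 is a hypothetical wrapper name chosen only for this example:

```prolog
% atom_to_chars(+Atom, -Chars)
% Hypothetical wrapper around the built-in atom_chars/2.
atom_to_chars(Atom, Chars) :-
    atom_chars(Atom, Chars).

% Example query:
% ?- atom_to_chars('(foo)', Cs).
% binds Cs to the list ['(', f, o, o, ')'].
```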


Edited on 2021-04-5
