MiniDevil As beautiful as a shell
tokenizer.c File Reference

Tokenizes the input string: extracts quoted or unquoted text and operators.

#include "token.h"
#include "libft.h"

Detailed Description

Tokenizes the input string: extracts quoted or unquoted text and operators.

Definition in file tokenizer.c.

Functions

static char * extract_unquoted (char *str, int *len, t_quote_type *qtype)
 Extract an unquoted text part until any delimiter is reached.

static char * extract_quoted (char *str, int *len, t_quote_type *qtype)
 Extract a quoted text part, stripping the quotes.

int process_word_token (char *s, t_token **head)
 Process a word token composed of quoted and unquoted chunks.

int process_operator_token (char *input, t_token **head)
 Process an operator token.

t_token * tokenize (char *input)
 Tokenize the input string into a linked list of tokens.
 

Function Documentation

◆ extract_unquoted()

static char* extract_unquoted ( char *  str,
int *  len,
t_quote_type *  qtype
)
static

Extract an unquoted text part until any delimiter is reached.

Parameters
str    Input string
len    Characters consumed
qtype  Is set to QUOTE_NONE
Returns
Newly allocated substring or NULL if failed
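
The behaviour described above can be sketched roughly as follows. This is an illustration only, not the project's code: sketch_is_whitespace(), sketch_is_operator(), the enum layout, the operator set, and the assumption that a quote character also ends an unquoted chunk are all hypothetical stand-ins for what token.h and the real helpers provide.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the project's t_quote_type. */
typedef enum e_quote_type
{
    QUOTE_NONE,
    QUOTE_SINGLE,
    QUOTE_DOUBLE
}   t_quote_type;

/* Assumed delimiter helpers; the real is_whitespace()/is_operator() may differ. */
static int sketch_is_whitespace(char c)
{
    return (c == ' ' || c == '\t' || c == '\n');
}

static int sketch_is_operator(char c)
{
    return (c == '|' || c == '<' || c == '>');
}

/* Copies the leading run of characters up to the first delimiter.
 * Assumption: whitespace, operators, quotes and the end of the string
 * all count as delimiters. */
static char *sketch_extract_unquoted(const char *str, int *len,
                                     t_quote_type *qtype)
{
    int     i;
    char    *out;

    i = 0;
    while (str[i] && !sketch_is_whitespace(str[i])
        && !sketch_is_operator(str[i])
        && str[i] != '\'' && str[i] != '"')
        i++;
    out = malloc((size_t)i + 1);
    if (!out)
        return (NULL);
    memcpy(out, str, (size_t)i);
    out[i] = '\0';
    *len = i;
    *qtype = QUOTE_NONE;
    return (out);
}
```

Under these assumptions, calling the sketch on hello"world" yields the substring hello with len set to 5; the caller owns the returned buffer and must free it.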

Definition at line 24 of file tokenizer.c.

References is_operator(), is_whitespace(), and QUOTE_NONE.

Referenced by process_word_token().


◆ extract_quoted()

static char* extract_quoted ( char *  str,
int *  len,
t_quote_type *  qtype
)
static

Extract a quoted text part, stripping the quotes.

Finds the matching closing quote and sets qtype according to the quote character.

Parameters
str    Input starting at the opening quote
len    Total characters consumed (including the 2 quotes)
qtype  Is set to QUOTE_SINGLE or QUOTE_DOUBLE
Returns
Content between the quotes, or NULL if the quote is unclosed (an error message is printed)
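
A minimal sketch of this contract, under stated assumptions: the enum layout, the use of strchr() to find the closing quote, and the wording and destination (stderr) of the error message are illustrative choices, not taken from tokenizer.c.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the project's t_quote_type. */
typedef enum e_quote_type
{
    QUOTE_NONE,
    QUOTE_SINGLE,
    QUOTE_DOUBLE
}   t_quote_type;

/* Returns a copy of the text between the opening quote at str[0] and the
 * next identical quote. *len counts the content plus both quotes. */
static char *sketch_extract_quoted(const char *str, int *len,
                                   t_quote_type *qtype)
{
    char        quote;
    const char  *close;
    size_t      n;
    char        *out;

    quote = str[0];
    if (quote != '\'' && quote != '"')
        return (NULL);
    close = strchr(str + 1, quote);
    if (!close)
    {
        /* Assumed message; the real wording is not shown in this page. */
        fprintf(stderr, "syntax error: unclosed quote\n");
        return (NULL);
    }
    n = (size_t)(close - (str + 1));
    out = malloc(n + 1);
    if (!out)
        return (NULL);
    memcpy(out, str + 1, n);
    out[n] = '\0';
    *len = (int)n + 2;
    if (quote == '\'')
        *qtype = QUOTE_SINGLE;
    else
        *qtype = QUOTE_DOUBLE;
    return (out);
}
```

For the input "a b" rest (starting at the double quote), the sketch returns a b, sets len to 5 and qtype to QUOTE_DOUBLE. An empty pair of quotes gives an empty string with len 2, which is distinct from the NULL returned for an unclosed quote.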

Definition at line 47 of file tokenizer.c.

References QUOTE_DOUBLE, and QUOTE_SINGLE.

Referenced by process_word_token().


◆ process_word_token()

int process_word_token ( char *  s,
t_token **  head 
)

Process a word token composed of quoted and unquoted chunks.

Parses adjacent chunks; for example, hello"world"'!' yields 3 connected tokens.

  • Each chunk becomes a separate token with its own quote_type; token->connected is set to 1 if another chunk follows immediately.
Parameters
s     Input string at word position
head  Pointer to token list head
Returns
Characters consumed or -1 on error (unclosed quotes or alloc failure)
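
The chunking and the connected flag can be illustrated with a self-contained sketch. Everything here is an assumption made for the example: t_chunk stands in for t_token (only quote_type and connected are named in this page; value and next are hypothetical), the delimiter set is guessed, and the real function delegates to extract_quoted(), extract_unquoted(), create_token() and add_token() rather than inlining the work.

```c
#include <stdlib.h>
#include <string.h>

typedef enum e_quote_type
{
    QUOTE_NONE,
    QUOTE_SINGLE,
    QUOTE_DOUBLE
}   t_quote_type;

/* Hypothetical, reduced stand-in for t_token. */
typedef struct s_chunk
{
    char            *value;
    t_quote_type    quote_type;
    int             connected;
    struct s_chunk  *next;
}   t_chunk;

/* End of a word: end of string, whitespace or an (assumed) operator. */
static int is_delim(char c)
{
    return (c == '\0' || c == ' ' || c == '\t' || c == '\n'
        || c == '|' || c == '<' || c == '>');
}

/* Length of the chunk at s, quotes included; -1 if a quote is unclosed. */
static int chunk_span(const char *s, t_quote_type *qtype)
{
    const char  *close;
    int         i;

    if (*s == '\'' || *s == '"')
    {
        close = strchr(s + 1, *s);
        if (!close)
            return (-1);
        *qtype = (*s == '\'') ? QUOTE_SINGLE : QUOTE_DOUBLE;
        return ((int)(close - s) + 1);
    }
    i = 0;
    while (!is_delim(s[i]) && s[i] != '\'' && s[i] != '"')
        i++;
    *qtype = QUOTE_NONE;
    return (i);
}

static void free_chunks(t_chunk *head)
{
    t_chunk *next;

    while (head)
    {
        next = head->next;
        free(head->value);
        free(head);
        head = next;
    }
}

/* Appends one node per chunk; returns characters consumed or -1 on error.
 * Nodes appended before an error stay in the list for the caller to free. */
static int sketch_process_word(const char *s, t_chunk **head)
{
    t_chunk         **tail;
    t_chunk         *node;
    t_quote_type    q;
    int             pos;
    int             span;
    int             quoted;

    tail = head;
    while (*tail)
        tail = &(*tail)->next;
    pos = 0;
    while (!is_delim(s[pos]))
    {
        span = chunk_span(s + pos, &q);
        if (span < 0)
            return (-1);
        quoted = (q != QUOTE_NONE);
        node = calloc(1, sizeof(*node));
        if (!node)
            return (-1);
        node->value = malloc((size_t)(span - 2 * quoted) + 1);
        if (!node->value)
            return (free(node), -1);
        memcpy(node->value, s + pos + quoted, (size_t)(span - 2 * quoted));
        node->value[span - 2 * quoted] = '\0';
        node->quote_type = q;
        pos += span;
        node->connected = !is_delim(s[pos]);
        *tail = node;
        tail = &node->next;
    }
    return (pos);
}
```

With the input hello"world"'!' next, this sketch consumes 15 characters and appends three nodes: hello (QUOTE_NONE, connected 1), world (QUOTE_DOUBLE, connected 1) and ! (QUOTE_SINGLE, connected 0). Keeping the chunks separate preserves each one's quote type, presumably so that a later stage can treat single-quoted and double-quoted text differently before joining connected tokens.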

Definition at line 80 of file tokenizer.c.

References add_token(), t_token::connected, create_token(), extract_quoted(), extract_unquoted(), is_operator(), is_whitespace(), t_token::quote_type, and TOKEN_WORD.

Referenced by tokenize().


◆ process_operator_token()

int process_operator_token ( char *  input,
t_token **  head 
)

Process an operator token.

Determines the operator type and length, creates a token and appends it

Parameters
input  Input at the operator character
head   Pointer to token list head
Returns
Number of chars consumed (1 or 2) or -1 on allocation failure
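
How a one- or two-character operator might be classified is sketched below. The operator set (|, <, >, << and >>) and the enum names are assumptions based on common shell syntax; the actual mapping lives in get_operator_token_type(), which this page does not show.

```c
/* Hypothetical operator kinds; the project's real token types may differ. */
typedef enum e_sketch_op
{
    SKETCH_OP_NONE,
    SKETCH_OP_PIPE,
    SKETCH_OP_REDIR_IN,
    SKETCH_OP_REDIR_OUT,
    SKETCH_OP_HEREDOC,
    SKETCH_OP_APPEND
}   t_sketch_op;

/* Classifies the operator at input and returns its length (1 or 2),
 * or 0 if input does not start with a known operator.
 * Two-character operators are tested first so that ">>" is not read
 * as two separate ">" tokens. */
static int sketch_operator_len(const char *input, t_sketch_op *type)
{
    *type = SKETCH_OP_NONE;
    if (input[0] == '<' && input[1] == '<')
        *type = SKETCH_OP_HEREDOC;
    else if (input[0] == '>' && input[1] == '>')
        *type = SKETCH_OP_APPEND;
    else if (input[0] == '<')
        *type = SKETCH_OP_REDIR_IN;
    else if (input[0] == '>')
        *type = SKETCH_OP_REDIR_OUT;
    else if (input[0] == '|')
        *type = SKETCH_OP_PIPE;
    if (*type == SKETCH_OP_HEREDOC || *type == SKETCH_OP_APPEND)
        return (2);
    if (*type == SKETCH_OP_NONE)
        return (0);
    return (1);
}
```

The returned length is what the caller would add to its read position, which matches the documented return value of 1 or 2 consumed characters.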

Definition at line 118 of file tokenizer.c.

References add_token(), create_token(), and get_operator_token_type().

Referenced by tokenize().


◆ tokenize()

t_token* tokenize ( char *  input)

Tokenize the input string into a linked list of tokens.

The tokenizer entry point: skips whitespace and dispatches each segment to process_operator_token() or process_word_token().

Parameters
input  Raw input string
Returns
Head of token list or NULL on error
Note
On error it frees all tokens before returning NULL
Warning
The caller must free the returned list with free_token_list()
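
The dispatch loop and its error path can be sketched as below. This is a simplified illustration under several assumptions: t_tok, push_tok(), word_len() and free_toks() are hypothetical stand-ins for t_token, the process_*_token() functions and free_token_list(); the operator set is guessed; and, unlike the real tokenizer, quotes are kept in the token text rather than stripped into separate chunks.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced token node. */
typedef struct s_tok
{
    char            *text;
    struct s_tok    *next;
}   t_tok;

static int is_space(char c)
{
    return (c == ' ' || c == '\t' || c == '\n');
}

static int is_op(char c)
{
    return (c == '|' || c == '<' || c == '>');
}

static void free_toks(t_tok *head)
{
    t_tok   *next;

    while (head)
    {
        next = head->next;
        free(head->text);
        free(head);
        head = next;
    }
}

/* Appends a copy of the first n chars of s; returns n or -1 on failure. */
static int push_tok(t_tok **head, const char *s, int n)
{
    t_tok   *node;

    node = calloc(1, sizeof(*node));
    if (!node)
        return (-1);
    node->text = malloc((size_t)n + 1);
    if (!node->text)
        return (free(node), -1);
    memcpy(node->text, s, (size_t)n);
    node->text[n] = '\0';
    while (*head)
        head = &(*head)->next;
    *head = node;
    return (n);
}

/* Length of the word at s, or -1 if it contains an unclosed quote. */
static int word_len(const char *s)
{
    const char  *close;
    int         i;

    i = 0;
    while (s[i] && !is_space(s[i]) && !is_op(s[i]))
    {
        if (s[i] == '\'' || s[i] == '"')
        {
            close = strchr(s + i + 1, s[i]);
            if (!close)
                return (-1);
            i = (int)(close - s);
        }
        i++;
    }
    return (i);
}

/* Skips whitespace, dispatches on the first character of each segment,
 * and frees everything built so far if a handler reports an error. */
static t_tok *sketch_tokenize(const char *input)
{
    t_tok   *head;
    int     used;

    head = NULL;
    while (*input)
    {
        if (is_space(*input))
        {
            input++;
            continue ;
        }
        if (is_op(*input))
            used = push_tok(&head, input,
                    1 + (input[0] != '|' && input[1] == input[0]));
        else
        {
            used = word_len(input);
            if (used >= 0)
                used = push_tok(&head, input, used);
        }
        if (used < 0)
            return (free_toks(head), NULL);
        input += used;
    }
    return (head);
}
```

In this sketch an empty or all-whitespace input also returns NULL, so a caller cannot tell it apart from an error by the return value alone; whether the real tokenize() behaves the same way is not stated in this page. On success the caller releases the list with free_toks(), mirroring the documented free_token_list() requirement.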

Definition at line 144 of file tokenizer.c.

References free_token_list(), is_operator(), is_whitespace(), process_operator_token(), and process_word_token().

Referenced by process_input(), and process_ui_input().

