Question:medium

Consider the following definition of a lexical token id for an identifier in a programming language, using extended regular expressions: \[ \text{letter} \;\;\rightarrow\;\; [A\!-\!Za\!-\!z] \] \[ \text{digit} \;\;\rightarrow\;\; [0\!-\!9] \] \[ \text{id} \;\;\rightarrow\;\; \text{letter (letter | digit)}^* \] Which one of the following Non-deterministic Finite-state Automata with $\epsilon$-transitions accepts the set of valid identifiers? (A double-circle denotes a final state). 

Show Hint

Identifiers in most programming languages must start with a letter, followed by letters or digits. This is why the automaton must force a letter first, and only then allow repetition of both letters and digits.
 

Updated On: Feb 3, 2026
  • A

  • B

  • C

  • D

Show Solution

The Correct Option is C

Solution and Explanation

To determine which Non-deterministic Finite-state Automata (NFA) with ε-transitions accepts the set of valid identifiers based on the given regular expression:

\[ \text{id} \;\;\rightarrow\;\; \text{letter (letter | digit)}^* \]

we need to understand and construct the NFA that adheres to the following rules:

  • An identifier must start with a \(\text{letter}\), which is \([A\!-\!Za\!-\!z]\).
  • After the first letter, it can have any combination of letters or digits \(\text{(letter | digit)}^*\), meaning zero or more occurrences of letters or digits are allowed.

Let's analyze the provided NFAs to determine which one implements this logic correctly.

  1. Option A does not correctly handle the subsequent characters as it doesn't properly link ε-transitions to allow multiple characters as needed.
  2. Option B has separate paths for letters and digits after the initial letter, but it might get stuck in loops and doesn't allow seamless transitions.
  3. Option C correctly starts with a letter, and through ε-transitions, it allows any number of letters or digits to follow, reflecting the regular expression correctly.
  4. Option D is restrictive as it mishandles the post-initial-character logic and misses the star operator's flexibility in repetition.

Analyzing each option carefully, we conclude that option C correctly represents the NFA for the given regular expression. Here’s why:

  • It begins with a \(\text{letter}\).
  • Through ε-transitions, it seamlessly allows any combination of letters and digits after the initial letter.

Therefore, the correct answer is option C, which accurately represents the given regular expression for an identifier.

Was this answer helpful?
0