Computer Languages - Homework 1
Please note that you will get 2 hours of supervision, but you are expected to spend more time working on these exercises on your own. See the instructions for the deadline.

Purpose

These computer-based exercises should help you gain familiarity with regular expressions and with a tool that, given a set of regular expressions, generates a program that recognizes the words they describe.

Regular Expressions
Where we use JFlex to generate programs that search for certain words and patterns in text files.

  1. Searching for your name.
    The file myName.flex is a source file for a tool called JFlex, which is already installed on your computer. You can download the file and, without changing it, run the following:
    jflex myName.flex
    javac MyName.java
    java MyName someFileOfText
    
    As you see, the program terminates without any output! In fact, the Java program MyName does nothing interesting! It just reads every character in the text file and takes no action for any of them!
    1. Look at the source file and, with help from the lecture and the manual, identify what is specified where (for example, the directive %class MyName determines the name of the Java class generated by JFlex).
    2. Add a regular expression describing only your name and make the program print to standard output the line(s) in which your name occurs. (The directive %line enables line counting; the Java variable yyline holds the current line number, counting from 0, while the input file is scanned.)
    3. Look at the file MyName.java and compare it with what you wrote in myName.flex.
  2. Searching for email addresses
    1. Use JFlex as before to generate a program that prints to standard output the e-mail addresses occurring in a file of text. An e-mail address is formed by a compound name followed by @ followed by a compound server. The parts of each compound are separated by dots. Each part can be described by:
      name     = [a-zA-Z][a-zA-Z_0-9\-]* 
      server   = [a-zA-Z][a-zA-Z]* 
      
      Look at the manual to see how to make definitions and how to use them!
    2. Modify your regular expressions so that only e-mail addresses from Halmstad University are selected.
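
    To cross-check what your generated scanner prints, the definitions above can also be tried out with java.util.regex. The following is only a minimal sketch and not a JFlex solution: the class EmailCheck, the method findEmails and the sample addresses are made up for illustration, and the pattern is simply the two parts from the exercise joined with dots and @. Keep in mind that JFlex always takes the longest match while java.util.regex works by backtracking, so the two can behave differently on other patterns.

    ```java
    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Hypothetical helper for cross-checking; not part of the course files.
    class EmailCheck {
        // The two parts, written as in the exercise (backslashes doubled for Java strings).
        static final String NAME = "[a-zA-Z][a-zA-Z_0-9\\-]*";
        static final String SERVER = "[a-zA-Z][a-zA-Z]*";

        // Compound name, then @, then compound server; parts are separated by dots.
        static final Pattern EMAIL = Pattern.compile(
            NAME + "(\\." + NAME + ")*@" + SERVER + "(\\." + SERVER + ")*");

        // Returns every substring of text that matches EMAIL, in order of occurrence.
        static List<String> findEmails(String text) {
            List<String> found = new ArrayList<>();
            Matcher m = EMAIL.matcher(text);
            while (m.find()) {
                found.add(m.group());
            }
            return found;
        }

        public static void main(String[] args) {
            // Made-up sample input.
            String sample = "Write to anna.berg@example.com or bo_2@mail.example.org today.";
            System.out.println(findEmails(sample));
        }
    }
    ```

    Running both programs on the same test file and comparing the printed addresses is a quick way to spot a mistake in either set of definitions.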
  3. Searching for telephone numbers
    Do the same as in the previous exercise but for telephone numbers instead. Try to capture different patterns: for local calls, for national calls, for international calls.
  4. Java keywords
    In the on-line specification of the lexical structure of the programming language Java you can find a list of the keywords of Java (chapter 3.9). Use JFlex to write a program that counts the occurrences of Java keywords in a file of text.

    Hint: Use the directive %implements with the name of a Java interface. In that interface, declare some integer constants for the keywords; you can then use these names to index an array in your program and count the occurrences of each keyword. You might also want to declare an array of strings in the interface, indexed by the same constants, so that you can print each keyword together with its number of occurrences. A short version of the interface could look like:

    interface Kw{
        int ABSTRACT = 0;
        int BOOLEAN  = 1;
        int BREAK    = 2;
    
        String[] kws = {"abstract","boolean","break"};
    }
    
    Check this: The command wc in Unix counts the number of lines, words and characters in a file. Compare the number of keywords with the number of words in a Java program. What are the other words?
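
    The indexing idea from the hint can be tried out in plain Java before wiring it into a JFlex specification. This is a sketch under assumptions: the class KwCounter and its methods scan and report are hypothetical, and splitting the text on non-word characters only stands in for the generated scanner, whose keyword rules would perform the increments instead.

    ```java
    // The short interface from the hint.
    interface Kw {
        int ABSTRACT = 0;
        int BOOLEAN  = 1;
        int BREAK    = 2;

        String[] kws = {"abstract", "boolean", "break"};
    }

    // Hypothetical stand-in for the generated scanner, which would see the
    // constants through the directive %implements Kw.
    class KwCounter implements Kw {
        // One counter per keyword, indexed by the constants of Kw.
        final int[] counts = new int[kws.length];

        // Assumption: words are maximal runs of letters, digits and underscores.
        // A JFlex scanner would instead increment counts[...] in each keyword rule.
        void scan(String text) {
            for (String word : text.split("\\W+")) {
                for (int k = 0; k < kws.length; k++) {
                    if (kws[k].equals(word)) {
                        counts[k]++;
                    }
                }
            }
        }

        // One line per keyword: the keyword and its number of occurrences.
        String report() {
            StringBuilder sb = new StringBuilder();
            for (int k = 0; k < kws.length; k++) {
                sb.append(kws[k]).append(": ").append(counts[k]).append('\n');
            }
            return sb.toString();
        }

        public static void main(String[] args) {
            KwCounter c = new KwCounter();
            c.scan("abstract class A { boolean b; void f() { break; } }");
            System.out.print(c.report());
        }
    }
    ```

    Because the counters and the strings share the same indices, adding a keyword means adding one constant and one string in the same position.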

Submission
Who should submit? What should be submitted? How to submit? When?

  1. You have worked in groups of 3 or 4, so submit together. That is, submit once with all your names!
  2. Submit a zipped file with all the JFlex sources and at least one test file of text for each exercise.
  3. Submit by sending an email to jerker.bengtsson|at|hh|dot|se with the subject "assignment 1". Sign the mail with all your names and e-mail addresses. Attach only one file!
  4. Deadline. Your mail should arrive on Thursday, January 29th, at 8.00 a.m. at the latest!