Improve gcc.py parse_output()

23 Jan 2018

Hi,

Recently I went looking for the code where mbed-cli compile parses the arm-gcc error/warning messages. After some hours of trial and error I found the Python method that does the parsing, and I think it is more complicated than it needs to be. I changed the code from:

gcc.py error/warning parsing now

[...]
    DIAGNOSTIC_PATTERN = re.compile('((?P<file>[^:]+):(?P<line>\d+):)(\d+:)? (?P<severity>warning|[eE]rror|fatal error): (?P<message>.+)')
    INDEX_PATTERN  = re.compile('(?P<col>\s*)\^')
[...]
    def parse_output(self, output):
        # The warning/error notification is multiline
        msg = None
        for line in output.splitlines():
            match = self.DIAGNOSTIC_PATTERN.search(line)
            if match is not None:
                if msg is not None:
                    self.cc_info(msg)
                    msg = None
                msg = {
                    'severity': match.group('severity').lower(),
                    'file': match.group('file'),
                    'line': match.group('line'),
                    'col': 0,
                    'message': match.group('message'),
                    'text': '',
                    'target_name': self.target.name,
                    'toolchain_name': self.name
                }
            elif msg is not None:
                # Determine the warning/error column by calculating the ^ position
                match = self.INDEX_PATTERN.match(line)
                if match is not None:
                    msg['col'] = len(match.group('col'))
                    self.cc_info(msg)
                    msg = None
                else:
                    msg['text'] += line+"\n"

        if msg is not None:
            self.cc_info(msg)

to:

gcc.py error/warning parsing proposed

[...]
    DIAGNOSTIC_PATTERN = re.compile('((?P<file>[^:]+):(?P<line>\d+):)(?P<col>\d+):? (?P<severity>warning|[eE]rror|fatal error): (?P<message>.+)')
[...]
    def parse_output(self, output):
        # The warning/error notification is multiline
        msg = None
        for line in output.splitlines():
            match = self.DIAGNOSTIC_PATTERN.search(line)
            if match is not None:
                if msg is not None:
                    self.cc_info(msg)
                    msg = None
                msg = {
                    'severity': match.group('severity').lower(),
                    'file': match.group('file'),
                    'line': match.group('line'),
                    'col': match.group('col'),
                    'message': match.group('message'),
                    'text': '',
                    'target_name': self.target.name,
                    'toolchain_name': self.name
                }
        if msg is not None:
            self.cc_info(msg)

I think this is more straightforward and does a better job, because the column parsing by counting spaces sometimes just fails (e.g. for FATFileSystem.cpp the warning is reported at 661:0, when in fact it is at 661:16), and to be honest it is a rather hacky solution. In my code the column is parsed directly from the arm-gcc output line instead of counting spaces, so the second regex is no longer necessary.
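
To make the idea easier to try outside of mbed-cli, here is a minimal standalone sketch of parsing the column straight from the diagnostic line. It is not the actual gcc.py code: the function name parse_diagnostics and the sample output are made up for illustration, and it returns a list instead of calling self.cc_info(). It also differs from the regex above in two ways that are my own assumptions: the column group is optional, so diagnostics printed without a column still match (the column then falls back to 0), and line and column are converted to int.

```python
import re

# Assumption: the column group is optional, so "file:line: severity: ..."
# lines (no column) still match. The rest follows the pattern in the post.
DIAGNOSTIC_PATTERN = re.compile(
    r'(?P<file>[^:]+):(?P<line>\d+):((?P<col>\d+):)? '
    r'(?P<severity>warning|[eE]rror|fatal error): (?P<message>.+)'
)


def parse_diagnostics(output):
    """Return one dict per diagnostic line found in the compiler output."""
    results = []
    for line in output.splitlines():
        match = DIAGNOSTIC_PATTERN.search(line)
        if match is None:
            # Source excerpts and caret lines do not match and are skipped.
            continue
        results.append({
            'severity': match.group('severity').lower(),
            'file': match.group('file'),
            'line': int(match.group('line')),
            # Fall back to 0 when the compiler printed no column.
            'col': int(match.group('col') or 0),
            'message': match.group('message'),
        })
    return results


# Made-up sample output; the message texts are illustrative only.
SAMPLE = (
    "FATFileSystem.cpp:661:16: warning: unused variable 'x'\n"
    "     int x = 0;\n"
    "                ^\n"
    "main.cpp:12: error: 'foo' was not declared\n"
)

for diag in parse_diagnostics(SAMPLE):
    print(diag['file'], diag['line'], diag['col'], diag['severity'])
```

With this sample the first diagnostic gets its column from the "661:16" prefix, and the second one, which has no column, falls back to 0 rather than being dropped.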

What do you think? Should I open a pull request?