HTTPServer and RPC

HTTPServer from the cookbook provides nice facilities, including integration with MBED's RPC. Unfortunately, the library has some limitations. One limitation of HTTPServer RPC that I found quite blocking is the inability to pass strings to MBED's RPC. Another is the incompatibility with the standard URL query format (in the form http://example.com/rpc/obj/method?variable=value&anothervar=value&...). The latter is not such a big deal if you use JavaScript in the browser or some other code-based access method, but for a simple web-based user interface it can be very useful to offer simple form-driven RPC control.

1. Here's my small modification to the HTTPRPC.h clean() function to address the first limitation. It deals with all %-encoded symbols, as well as quoted text (to pass strings if needed). The URL formats it accepts are:

  • http://example.com/rpc/obj1/method,arg1[,arg2...] (commas)
  • http://example.com/rpc/obj1/method arg1[ arg2...] (spaces)
  • http://example.com/rpc/obj1/method+arg1[+arg2...] (pluses)
  • http://example.com/rpc/obj1/method=arg1[=arg2...] (equals signs)
  • any of the arguments can be enclosed in double quotes; inside the quotes, a backslash can be used to escape a double quote:
    "some string text with \"quotes\" inside."
  • any %-encoded symbols are decoded (a bare percent sign is not recommended, because it is decoded whenever two hex digits happen to follow it; write %25 instead)

    inline void clean(char *str) const {
      char *in = str;
      char *out = str;    // this will do conversions in-place.
      bool inquotes=false;
      bool backslash=false;

      while (*in) {
          *out = *in;

          if (*in=='%' && is_hex(*(in+1)) && is_hex(*(in+2)) ) {
            *out = hex_byte(++in);
            in++;    // in is incremented by 2 total
          } else
          if (!inquotes && (*in==',' || *in=='+' || *in=='=')) {
            *out = ' ';
          }
          if (!backslash && *out == '"') { inquotes = !inquotes; }
          backslash = inquotes && !backslash && (*out == '\\');

          out++;

          in++;
      } // while

      *out = '\0';
    }
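
To see what this version does to a request, here is a minimal standalone sketch that repeats the same logic as free functions (in HTTPRPC.h they are const member functions) and checks a few inputs. The names clean_args and clean_args_examples and the sample paths are made up for illustration; the exact string that HTTPServer hands to clean() may differ.

```cpp
#include <cassert>
#include <cstring>

// Free-function copies of the helpers, so the sketch compiles on its own.
inline bool is_hex(char a) {
    return (a >= '0' && a <= '9') || (a >= 'A' && a <= 'F') || (a >= 'a' && a <= 'f');
}
inline char hex_nib(char a) {
    return 0xf & ((a >= '0' && a <= '9') ? (a - '0')
                : (a >= 'A' && a <= 'F') ? (a - 'A' + 10)
                : (a >= 'a' && a <= 'f') ? (a - 'a' + 10) : 0);
}
inline char hex_byte(const char *str) {
    return static_cast<char>((hex_nib(str[0]) << 4) | hex_nib(str[1]));
}

// Same logic as the first clean() above; clean_args is a hypothetical name.
inline void clean_args(char *str) {
    assert(str);               // the conversion is in-place, so a buffer is required
    char *in = str;
    char *out = str;
    bool inquotes = false;
    bool backslash = false;
    while (*in) {
        *out = *in;
        if (*in == '%' && is_hex(in[1]) && is_hex(in[2])) {
            *out = hex_byte(++in);   // decode %XX into one byte
            in++;
        } else if (!inquotes && (*in == ',' || *in == '+' || *in == '=')) {
            *out = ' ';              // separators become spaces outside quotes
        }
        if (!backslash && *out == '"') { inquotes = !inquotes; }
        backslash = inquotes && !backslash && (*out == '\\');
        out++;
        in++;
    }
    *out = '\0';
}

// Illustrative checks; call this from your own test code.
inline void clean_args_examples() {
    char commas[] = "/led1/write,1";
    clean_args(commas);
    assert(std::strcmp(commas, "/led1/write 1") == 0);

    // %22 is a double quote; the comma inside the quotes is preserved.
    char quoted[] = "/lcd/puts+%22a,b%22";
    clean_args(quoted);
    assert(std::strcmp(quoted, "/lcd/puts \"a,b\"") == 0);
}
```

Because the output is never longer than the input, the result can safely be written back into the same buffer.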


2. Here's a more advanced implementation that handles %-encoded and quoted text as well as the URL query format. The query handling is compiled in only when HTTPRPC_USE_URI_FIELDS is defined as non-zero. In addition to the formats above, it accepts a URI query as follows:

  • http://example.com/rpc/obj1/method?arg1=val1[&arg2=val2...] (standard form)
  • http://example.com/rpc/obj1/method?val1[&val2...] (abbreviated form)
  • Note that arguments must be in the same positional order as in the original method/function, even if their names are given in the query. That restricts HTML forms to having their fields in that fixed order. All browsers I tried preserve the ordering. Of course, JavaScript can be used to reorder arguments as needed.

    inline void clean(char *str) const {
      char *in = str;
      char *out = str;    // this will do conversions in-place.
      bool inquotes=false;
      bool backslash=false;
      bool hasquery=false;
      bool cantquery=false;
        // cantquery=true will indicate that the URI already has symbols 
        // incompatible with query, so it disables checking for query.
#if HTTPRPC_USE_URI_FIELDS
      bool infield=false;
      char *field=out;
#endif

      while (*in) {
#if HTTPRPC_USE_URI_FIELDS
        // Check if URI has query part
        // in form "/rpc/obj/method?arg1=val1&arg2=val2&arg3=val3"
        if (!inquotes && !cantquery && !hasquery && *in == '?') {
          hasquery = true;
          *out = ' '; out++;  // delimit base URI part
          infield = true; in++; field=in; continue;
          // New field started. Do nothing yet
        }
        // Check if URI with query part is delimited
        if (!inquotes && !infield && hasquery && *in == '&') {
          *out = ' '; out++;  // delimit current arg
          infield = true; in++; field=in; continue;
          // New field started
        }
        if (infield) {
          // Process the query - skip till '='
          // Also check if it is in form "/rpc/obj/method?val1&val2&val3"
          if (!inquotes && *in == '&') {
            // we did not see a '=' sign. We got abbreviated query - just catch up
            while (field<in) {
              *out = *field;
              if (*field=='%' && is_hex(*(field+1)) && is_hex(*(field+2)) ) {
                *out = hex_byte(++field);
                field++;    // field is incremented by 2 total
              }
              if (!backslash && *out == '"') { inquotes = !inquotes; }
              backslash = inquotes && !backslash && (*out == '\\');

              out++;
              field++;
            }
            
            *out = ' '; out++;  // delimit current arg
            infield = true; in++; field=in; continue;
            // New field started
          } else
          if (!inquotes && *in == '=') {
            infield = false;
            *in = '\0';  // this will mark the field name
//printf("    - field: %s\r\n", field);
// FIXME: here we have a field/arg name. Can we use it to reorder the arguments for the rpc?

          } else {
            // Keep tracking quotes
            char tmp = *in;
            if (*in=='%' && is_hex(*(in+1)) && is_hex(*(in+2)) ) {
              tmp = hex_byte(++in);
              in++;    // in is incremented by 2 total
            }
            if (!backslash && tmp == '"') { inquotes = !inquotes; }
            backslash = inquotes && !backslash && (tmp == '\\');
          }
        } else
#endif    // HTTPRPC_USE_URI_FIELDS
        {
          // Keep processing the stream
          *out = *in;

          if (*in=='%' && is_hex(*(in+1)) && is_hex(*(in+2)) ) {
            *out = hex_byte(++in);
            in++;    // in is incremented by 2 total
            cantquery = !hasquery;   // any %-encoded before '?' means it can't be a query
          } else
          if (!inquotes && !hasquery && (*in==',' || *in=='+' || *in=='=')) {
            *out = ' ';
          }
          if (!backslash && *out == '"') { inquotes = !inquotes; }
          backslash = inquotes && !backslash && (*out == '\\');

          out++;
        }

        in++;
      } // while

#if HTTPRPC_USE_URI_FIELDS
      if (infield) {
        // abbreviated query last arg - catch up
        while (field < in) {
          *out = *field;
          if (*field=='%' && is_hex(*(field+1)) && is_hex(*(field+2)) ) {
            *out = hex_byte(++field);
            field++;    // field is incremented by 2 total
          }
          if (!backslash && *out == '"') { inquotes = !inquotes; }
          backslash = inquotes && !backslash && (*out == '\\');

          out++;
          field++;
        }
      }
#endif    // HTTPRPC_USE_URI_FIELDS

      hasquery = cantquery; // only removes compiler warning
      *out = '\0';
    }
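
The mapping from a query to positional arguments can be illustrated with a much simpler sketch. This is not the code above: it works on std::string instead of in place, ignores quoting and %-decoding, and the names query_to_args and query_to_args_examples are hypothetical. It only shows that field names are dropped and values are passed on in the order they appear.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of HTTPServer: turns
// "/obj/method?name1=val1&name2=val2" into "/obj/method val1 val2".
inline std::string query_to_args(const std::string &uri) {
    const std::string::size_type qpos = uri.find('?');
    if (qpos == std::string::npos) { return uri; }   // no query part

    std::string result = uri.substr(0, qpos);
    std::string::size_type start = qpos + 1;
    while (start <= uri.size()) {
        std::string::size_type end = uri.find('&', start);
        if (end == std::string::npos) { end = uri.size(); }

        std::string field = uri.substr(start, end - start);
        const std::string::size_type eq = field.find('=');
        if (eq != std::string::npos) {
            field.erase(0, eq + 1);   // standard form: drop "name="
        }                             // abbreviated form: keep as-is
        result += ' ';
        result += field;
        start = end + 1;
    }
    return result;
}

// Illustrative checks; call this from your own test code.
inline void query_to_args_examples() {
    assert(query_to_args("/led/write?value=1") == "/led/write 1");
    assert(query_to_args("/obj/m?2&1") == "/obj/m 2 1");
    // Names are ignored: swapping the fields swaps the arguments.
    assert(query_to_args("/obj/m?b=2&a=1") == "/obj/m 2 1");
}
```

The last check shows why the order of form fields matters: the RPC method receives 2 as its first argument regardless of the field names.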

 

Both versions of clean() above use a few additional parsing functions:

 

    inline bool is_hex(const char a) const {
        return (a>='0' && a<='9') || (a>='A' && a<='F') || (a>='a' && a<='f');
    }
    inline char hex_nib(const char a) const {
        return 0xf & (
          (a>='0' && a<='9') ? (a-'0'   ) :
          (a>='A' && a<='F') ? (a-'A'+10) :
          (a>='a' && a<='f') ? (a-'a'+10) : 0
        );
    }
    inline char hex_byte(const char *str) const {
        return (hex_nib(*str) << 4) | hex_nib(*(str+1));
    }

 

 

What code #2 still does not do is wrap values in double quotes so that strings are always safe. That would be handy for passing strings from HTML forms to the RPC method without JavaScript. Unfortunately, it is not straightforward: the conversion is done in place (reusing the input buffer) to avoid the headaches of memory allocation, and inserting double quotes would overwrite input that has not been parsed yet. Implementing it would require changing the API somewhat, adding memory management, and making the parsing more complex. It is completely doable. That said, I will hold off on implementing it until I need it myself.
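
As a rough idea of what such a version could look like, here is a minimal sketch that writes into a separate std::string instead of converting in place. Everything in it is an assumption and not part of HTTPServer: the function names are hypothetical, '+' is treated as a space (the HTML form encoding convention), and embedded quotes and backslashes are escaped with a backslash as in the quoting rule described earlier. Whether the RPC side accepts numeric arguments wrapped in quotes has not been checked.

```cpp
#include <cassert>
#include <string>

// Returns 0..15 for a hex digit, -1 otherwise.
inline int hex_value(char c) {
    if (c >= '0' && c <= '9') { return c - '0'; }
    if (c >= 'A' && c <= 'F') { return c - 'A' + 10; }
    if (c >= 'a' && c <= 'f') { return c - 'a' + 10; }
    return -1;
}

// Decodes one form value: "%XX" becomes a byte, '+' becomes a space.
inline std::string decode_value(const std::string &raw) {
    std::string out;
    for (std::string::size_type i = 0; i < raw.size(); ++i) {
        if (raw[i] == '%' && i + 2 < raw.size() &&
            hex_value(raw[i + 1]) >= 0 && hex_value(raw[i + 2]) >= 0) {
            out += static_cast<char>(hex_value(raw[i + 1]) * 16 + hex_value(raw[i + 2]));
            i += 2;
        } else if (raw[i] == '+') {
            out += ' ';
        } else {
            out += raw[i];
        }
    }
    return out;
}

// Wraps a value in double quotes, escaping embedded '"' and '\'.
inline std::string quote_value(const std::string &value) {
    std::string out = "\"";
    for (std::string::size_type i = 0; i < value.size(); ++i) {
        if (value[i] == '"' || value[i] == '\\') { out += '\\'; }
        out += value[i];
    }
    out += '"';
    return out;
}

// Takes only the part after '?', e.g. "name1=val1&name2=val2",
// and returns the values as space-separated quoted strings.
inline std::string query_to_quoted_args(const std::string &query) {
    std::string result;
    if (query.empty()) { return result; }
    std::string::size_type start = 0;
    while (start <= query.size()) {
        std::string::size_type end = query.find('&', start);
        if (end == std::string::npos) { end = query.size(); }
        std::string field = query.substr(start, end - start);
        const std::string::size_type eq = field.find('=');
        if (eq != std::string::npos) { field.erase(0, eq + 1); }   // drop "name="
        if (!result.empty()) { result += ' '; }
        result += quote_value(decode_value(field));
        start = end + 1;
    }
    return result;
}

// Illustrative checks; call this from your own test code.
inline void query_to_quoted_args_examples() {
    assert(query_to_quoted_args("text=hello+world&n=5") == "\"hello world\" \"5\"");
    assert(query_to_quoted_args("say=%22hi%22") == "\"\\\"hi\\\"\"");
}
```

The cost is exactly the one mentioned above: the output can be longer than the input, so it needs its own storage, which on a small target may be a reason to stay with the in-place version.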

