git.samba.org - samba.git/commit

author	Noel Power <noel.power@suse.com>
	Wed, 5 Sep 2018 14:23:01 +0000 (15:23 +0100)
committer	Noel Power <npower@samba.org>
	Mon, 5 Nov 2018 19:05:24 +0000 (20:05 +0100)
commit	16596842a62bec0a9d974c48d64000e3c079254e
tree	dfa5c18f0d7f748627aa4c6f1c6d4148f066f79b	tree
parent	19a459bac3932427afc65661d06dd7a6eda8865e	commit \| diff

python/samba/gp_parse: Fix mulitple encode step with write_section

In python2 as far as I can see GptTmplInfParser.write_binary more
or less works by accident.

write_binary creates a writer for the 'utf8' codec, such a writer
should consume unicode and emit utf8 encoded bytes. This writer
is passed to each of the sections managed by GptTmplInfParser as
follows

    def write_binary(self, filename):
        with codecs.open(filename, 'wb+',
                         self.encoding) as f:
            for s in self.sections:
                self.sections[s].write_section(s, f)

And each section type itself is encoding its result to 'utf-16-le'
e.g.
    class UnicodeParam(AbstractParam):
         def write_section(self, header, fp):
            fp.write(u'[Unicode]\r\nUnicode=yes\r\n'.encode(self.encoding)

But this makes little sense, it seems like sections are encoded to one
encoding but the total file is supposed to be encoded as ut8??? Also
having an encoding per ParamType doesn't seem correct.

Bizarely in PY2 this works and it actually encodes the whole file as utf-16le
In PY3 you can't do this as the writer wants to deal with strings not bytes
(after the extra encode phase in 'write_section'.

So, changes here are to remove the unnecessary encoding in each 'write_section'
method, additionally in GptTmplInfParser.write_binary the
codecs.open call now uses the correct codec (e.g. 'utf-16-le') to write

Signed-off-by: Noel Power <noel.power@suse.com>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>