
[Python 3] bytes-like strings decoding for saslauthd.py
ClosedPublic

Authored by ghane on Mar 23 2022, 6:54 PM.
Tags
None

Details

Summary

Bytes-like strings need to be decoded to text strings.

Debugging the original code under Python 2.7 returned unicode strings, but under Python 3 the same variables hold bytes-like strings.

optparse gets text strings from the shell.
configparser uses text strings internally.
From version 3.0, python-ldap uses text where appropriate. On Python 2, the bytes mode setting influences how text is handled.

Socket streams and the database code return bytes-like strings, which this diff decodes to text for python-ldap operations.

Diff Detail

Repository
rP pykolab
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

ghane requested review of this revision. Mar 23 2022, 6:54 PM
ghane created this revision.
ghane edited the summary of this revision.
vanmeeuwen subscribed.

I don't understand the case or cases in which this change helps, where it would have otherwise failed.

This revision now requires changes to proceed. Mar 23 2022, 7:13 PM

This is for Python 3, which separates byte strings and text strings more strictly in type operations.
https://docs.python.org/3/howto/pyporting.html#text-versus-binary-data

You can find each affected string in the code, for example:

    login.append(value)

if len(login) == 4:
    realm = login[3]
elif len(login[0].split('@')) > 1:  # fails on Python 3: login[0] is bytes, '@' is text (both are str on Python 2)
    realm = login[0].split('@')[1]  # fails on Python 3 for the same reason
else:
    realm = conf.get('kolab', 'primary_domain')  # text string on Python 3; in auth/ this gets mixed with the bytes in login[0] and login[1]
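
A minimal sketch of the failure described in the comments above, assuming Python 3 and assuming the values read from the socket arrive as bytes. The helper name `to_text` and the sample login values are hypothetical and not taken from the diff.

```python
def to_text(value, encoding="utf-8"):
    """Return value as a text string, decoding it first if it is bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Hypothetical login data as it might arrive from a socket on Python 3.
login = [b"john@example.org", b"secret", b"imap"]

try:
    login[0].split("@")  # bytes.split() called with a text separator
    mixed_split_failed = False
except TypeError:
    mixed_split_failed = True  # Python 3 refuses to mix bytes and text

# Decoding once, up front, lets the rest of the logic work on text only.
login = [to_text(value) for value in login]
realm = login[0].split("@")[1]
```

Decoding at the point where the data enters the program keeps the later comparisons with configuration values (which are text) consistent.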

I tested the code against Python 2.7, 3.7, and 3.8 on Debian buster and Ubuntu focal.

@vanmeeuwen, how should we proceed here? This is an effort to get the PyKolab codebase into a state where it works with Python 3 without breaking existing systems that are still based on legacy Python 2. Given that background, the commit looks plausible to me.

pykolab/auth/ldap/auth_cache.py
139

I know it's already present in the original code, but the second argument to encode() looks strange to me. Isn't that argument supposed to be a string describing the error-handling scheme? The value 'latin1' wouldn't make any sense in that case.
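
To illustrate the question: in Python 3 the signature is `str.encode(encoding='utf-8', errors='strict')`, so the second positional argument is indeed the error handler. The sketch below uses a hypothetical helper `try_encode` and sample text; it is not the code from auth_cache.py.

```python
def try_encode(text, encoding, errors):
    """Encode text and report a bad error handler instead of raising."""
    try:
        return text.encode(encoding, errors)
    except LookupError as exc:
        return "LookupError: %s" % exc


# 'replace' is a real error handler: the unencodable character becomes '?'.
replaced = try_encode(u"caf\u00e9", "ascii", "replace")

# 'latin1' is a codec name, not an error handler, so the call fails as
# soon as the handler is actually needed.
bogus = try_encode(u"caf\u00e9", "ascii", "latin1")
```

When every character can be encoded, the handler is usually never looked up, which may be why a stray `'latin1'` in that position can go unnoticed.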

streams:  Python 2 -> str (byte string) | Python 3 -> bytes : the types differ
encode(): Python 2 -> str (byte string) | Python 3 -> bytes : the types differ
decode(): Python 2 -> unicode (text string) | Python 3 -> str (text string) : both are text

LDAP needs a text string as the search string to return a result; otherwise the search returns 0 results. This is an error, and this case is not filtered.
On Python 2, LDAP receives a str because byte strings are also of class str (bytes is an alias for str there). Python 3 is more explicit: byte strings are of class bytes.
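
One plausible way an empty search result can arise on Python 3, sketched without python-ldap: formatting a bytes value into a text filter embeds its `b'...'` representation. The helper `build_filter` and the `uid` attribute are hypothetical examples, not code from the diff.

```python
def build_filter(attribute, value):
    """Build a simple LDAP equality filter, decoding bytes values first."""
    if isinstance(value, bytes):
        value = value.decode("utf-8")
    return "(%s=%s)" % (attribute, value)


# Python 3: the bytes value is rendered as its repr inside the text filter,
# so the directory is asked for a uid that does not exist.
naive_filter = "(uid=%s)" % b"john"

# Decoding first yields the filter that was intended.
safe_filter = build_filter("uid", b"john")
```

This sketch does not escape special filter characters; real code would normally also pass the value through `ldap.filter.escape_filter_chars` from python-ldap.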

Explicitly changing the columns of the "entries" table, "domain" from String to Unicode and "values, keys" from Text to UnicodeText, would make encode() and decode() unnecessary in auth_cache.py, because SQLAlchemy would handle the encoding and decoding.
If the OS locale encoding is needed, encode().decode() would do the job when that encoding is passed explicitly; on Python 3, encode() and decode() default to UTF-8 rather than the OS locale.
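
A small sketch for the locale point, assuming Python 3. There, `bytes.decode()` called without arguments uses UTF-8, so the locale encoding has to be requested explicitly, for example via `locale.getpreferredencoding()`. The sample strings are illustrative only.

```python
import locale

data = u"caf\u00e9".encode("utf-8")

# Python 3: decode() without arguments uses UTF-8, whatever the locale is.
default_decoded = data.decode()

# The locale's preferred encoding must be looked up and passed explicitly.
locale_encoding = locale.getpreferredencoding(False)
locale_roundtrip = u"abc".encode(locale_encoding).decode(locale_encoding)
```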

encode() and decode() become obsolete for the values returned from SQL, since SQLAlchemy should do the encoding and decoding.
See also:

class Entry
    def __init__

Lines 70-75 check for unicode.

Nice, this looks way better.

This revision was not accepted when it landed; it landed in state Needs Review. Jun 15 2022, 11:57 PM
This revision was automatically updated to reflect the committed changes.