[Python 3] bytes-like strings decoding for saslauthd.py
Needs Revision, Public

Authored by ghane on Mar 23 2022, 6:54 PM.

Details

Reviewers
vanmeeuwen
Group Reviewers
PyKolab Developers
Summary

Bytes-like strings need to be decoded to text strings.

When debugging, the original code returned unicode strings under Python 2.7, but under Python 3 the same variables hold bytes-like strings.

optparse gets text strings from the shell.
configparser uses text strings internally.
From version 3.0, python-ldap uses text where appropriate. On Python 2, the bytes mode setting influences how text is handled.

Socket streams and the database return bytes-like strings, which this diff decodes so they can be used in python-ldap operations.
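
As a rough illustration of the decoding described above, a small helper can normalize values that may arrive as either bytes or text. The name `to_text` and the UTF-8 default are assumptions for this sketch and are not taken from the diff.

```python
def to_text(value, encoding='utf-8'):
    """Return value as a text string, decoding it only if it is bytes."""
    # Hypothetical helper; the UTF-8 default is an assumption.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Bytes as they might arrive from a socket stream, and text as returned
# by configparser.
print(to_text(b'john@example.org'))
print(to_text('example.org'))
```

Because text input is passed through unchanged, such a helper can be applied at the boundary without checking where each value came from.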

Diff Detail

Repository
rP pykolab
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

ghane requested review of this revision. Mar 23 2022, 6:54 PM
ghane created this revision.
ghane edited the summary of this revision.
vanmeeuwen requested changes to this revision. Mar 23 2022, 7:13 PM
vanmeeuwen added a subscriber: vanmeeuwen.

I don't understand the case or cases in which this change helps, where it would have otherwise failed.

This revision now requires changes to proceed. Mar 23 2022, 7:13 PM
ghane added a comment. Edited Mar 23 2022, 7:34 PM

This is for Python 3, which is stricter about mixing byte strings and text strings in the same operation.
https://docs.python.org/3/howto/pyporting.html#text-versus-binary-data

You can look for each affected string in the code, for example:

    login.append(value)

if len(login) == 4:
    realm = login[3]
# Python 3: login[0] is bytes but '@' is a text string, so split() raises
# TypeError. In Python 2 both are str, so this works.
elif len(login[0].split('@')) > 1:
    realm = login[0].split('@')[1]
else:
    # Text string in Python 3; in auth/ you would then get a mix of types
    # between login[0], login[1] and realm.
    realm = conf.get('kolab', 'primary_domain')
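
The failure noted in the comments above can be reproduced in isolation on Python 3. The login values are made up for this sketch, and decoding as UTF-8 is an assumption.

```python
# Made-up values, shaped like what is read from a byte stream.
login = [b'john@example.org', b'secret']

try:
    login[0].split('@')  # bytes.split() with a text separator
except TypeError as exc:
    print('split failed:', exc)

# Decoding first makes the split work; UTF-8 is an assumption here.
realm = login[0].decode('utf-8').split('@')[1]
print(realm)
```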

I tested the code against Python 2.7, 3.7 and 3.8 on Debian buster and Ubuntu focal.

@vanmeeuwen, how should we proceed here? This is an effort to get the PyKolab codebase into a state where it works with Python 3 without breaking existing systems that are still based on legacy Python 2. Given that background, the commit looks plausible to me.

pykolab/auth/ldap/auth_cache.py
139

I know it's already present in the original code, but the second argument to encode() looks strange to me. Isn't that argument supposed to be a string describing the error-handling scheme? The value 'latin1' wouldn't make any sense in that case.
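
For reference, the second positional argument of `str.encode()` is indeed `errors`, the name of an error handler. The sketch below runs on Python 3; the encodings and sample strings are for illustration only and are not taken from auth_cache.py.

```python
# A valid handler name: the unencodable character is replaced.
print('caf\u00e9'.encode('ascii', 'replace'))

# 'latin1' is a codec name, not an error handler. Once a character
# actually fails to encode, the handler lookup is rejected.
handler_rejected = False
try:
    'caf\u00e9'.encode('ascii', 'latin1')
except LookupError as exc:
    handler_rejected = True
    print('error handler lookup failed:', exc)
```

In a normal interpreter run the handler name appears to be consulted only when an encoding error occurs, which would explain why a value like 'latin1' can sit in that position unnoticed as long as every character is encodable.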

streams: Python 2 -> str (byte string) | Python 3 -> bytes: uneven
encode(): Python 2 -> str (byte string) | Python 3 -> bytes: uneven
decode(): Python 2 -> unicode (text string) | Python 3 -> str (text string): even

LDAP needs a text string as the search string to return a result; otherwise the search returns 0 results. That is an error, and this case is not caught.
On Python 2, LDAP gets a str either way, because byte strings are also of class str. Python 3 is more explicit: byte strings are of class bytes, whereas in Python 2 bytes is just another name for str.
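
One way an empty search result can come about on Python 3: formatting a bytes value into a text filter does not fail, it embeds the repr of the bytes object. The filter template and the value below are hypothetical, and no python-ldap call is made in this sketch.

```python
uid = b'john'  # hypothetical value, as read from a byte stream

# Formatting bytes into text silently produces the repr of the bytes.
bad_filter = '(uid=%s)' % uid
# Decoding first gives the intended filter; UTF-8 is an assumption.
good_filter = '(uid=%s)' % uid.decode('utf-8')

print(bad_filter)
print(good_filter)
```

A filter like the first one is still well-formed, so the search would most likely return no entries rather than raise an error.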

Explicitly changing the columns of the "entries" table, "domain" from String to Unicode and "values, keys" from Text to UnicodeText, would make encode() and decode() unnecessary in auth_cache.py, because encoding and decoding would then be handled by SQLAlchemy.
If you need the OS locale encoding, encode().decode() would do the job. Note that decode() without arguments defaults to UTF-8 on Python 3 (ASCII on Python 2), not to the OS locale, so the locale encoding has to be passed explicitly.
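
If the locale encoding is really what is wanted, it can be looked up and passed explicitly. A minimal sketch using the standard `locale` module follows; the sample value is made up.

```python
import locale

# Encoding preferred by the OS locale, without changing the process locale.
encoding = locale.getpreferredencoding(False)

value = 'example.org'  # made-up sample, plain ASCII
raw = value.encode(encoding)
print(encoding, raw.decode(encoding) == value)
```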