Novel thermostable GH5_34 arabinoxylanase with an atypical CBM6 displays activity on oat fiber xylan for prebiotic production

Abstract Carbohydrate active enzymes are valuable tools in cereal processing to valorize underutilized side streams. By solubilizing hemicellulose and modifying the fiber structure, novel food products with increased nutritional value can be created. In this study, a novel GH5_34 subfamily arabinoxylanase from Herbinix hemicellulosilytica, HhXyn5A, was identified, produced and extensively characterized for intended use in cereal processing to solubilize potential prebiotic fibers: arabinoxylo-oligosaccharides (AXOS). The purified two-domain HhXyn5A (catalytic domain and CBM6) demonstrated high storage stability and a melting temperature (Tm) of 61 °C, and the optimum reaction conditions on wheat arabinoxylan were determined to be 55 °C and pH 6.5. HhXyn5A demonstrated activity on various commercial cereal arabinoxylans and produced prebiotic AXOS, whereas the sole catalytic domain of HhXyn5A did not demonstrate detectable activity. HhXyn5A demonstrated no side activity on oat β-glucan. In contrast to the commercially available homolog CtXyn5A, HhXyn5A gave a more specific HPAEC–PAD oligosaccharide product profile when using wheat arabinoxylan and alkali extracted oat bran fibers as the substrate. Results from multiple sequence alignment of GH5_34 enzymes, homology modeling of HhXyn5A and docking simulations with ligands XXXA3, XXXA3XX and X5 indicated that the active site of the HhXyn5A catalytic domain is highly conserved and can accommodate both shorter and longer ligands. However, significant structural dissimilarities between HhXyn5A and CtXyn5A in the binding cleft of CBM6, due to the lack of important ligand-interacting residues, are suggested to cause the observed differences in substrate specificity and product formation.


Introduction
Enzymatic processing of biomass is a widespread approach and an important step in biorefineries and the food industry, used to break down recalcitrant structures and valorize individual process streams into various building-block chemicals, speciality chemicals and consumer products (Silva et al. 2018; Ostby et al. 2020). Carbohydrate acting enzymes such as glycoside hydrolases (GH) are one example of an essential group of enzymes used on forest and agricultural materials for this purpose, employed to hydrolyze carbohydrates into shorter oligosaccharides and monosugars (Gilbert et al. 2008). Endo-β-1,4-xylanases (EC 3.2.1.8) catalyze the cleavage of β-1,4 linkages in the backbone of xylans. Their activity varies depending on the type and extent of substitutions on the xylan, and they can be found in GH families such as GH5, GH10, GH11 and GH30. Xylanases that specifically cleave the hemicellulose arabinoxylan (AX) present in many plants are especially important for valorizing fiber-rich side-streams from processing of cereals and grains containing high amounts of AX (Nordberg Karlsson et al. 2018). Promising end products from the enzymatic hydrolysis of AX are xylo-oligosaccharides (XOS) and arabinoxylo-oligosaccharides (AXOS), which have shown potential as prebiotics by selectively stimulating the growth of probiotic bacteria, for example, Lactobacillus and Bifidobacterium species (Sajib et al. 2018; Bhattacharya et al. 2020). The probiotic bacteria can, in turn, help to maintain a healthy gut and microbiota by producing short-chain fatty acids and reducing cancer cell proliferation (Broekaert et al. 2011).
Depending on the enzymes employed in the process, the resulting XOS and AXOS can have different lengths and substituent patterns, which are important aspects to consider when customizing the final product for stimulating selected bacterial species. Xylanases from family GH10 and GH11 are commonly used for AX degradation (Nordberg Karlsson et al. 2018). However, their activity and specificity vary substantially, depending on the biomass origin and substrate purity, as these enzymes can be more or less restricted by extensive substitution (Linares-Pasten et al. 2018; Schmitz et al. 2022b). In addition, xylanases from families GH10 and GH11 often display some side activity on mixed linkage β-glucan (Shi et al. 2010), a valuable fiber abundant in barley and oat. Oat β-glucans of high molecular weight have proven to be health beneficial by lowering blood glucose levels and cholesterol (Wang and Ellis 2014), explaining why the side activity could be undesired for certain cereal processes.
Family GH5 subfamily 34 (GH5_34) consists mainly of multimodular, AX-specific endo-β-1,4-xylanases, or arabinoxylanases (EC 3.2.1.-), which are reported to require an O3-linked arabinose substituent in the active site (in a subsite termed −2*, adjacent to the −1 subsite) to be able to cleave the xylan backbone (Correia et al. 2011; Labourel et al. 2016). These enzymes are also of interest because of their ability to accommodate several arabinose substituents in the active site (in the glycone as well as the aglycone subsites), which enables them to hydrolyze highly decorated substrates, aiding in the complete degradation of recalcitrant fibers and yielding unique AXOS product profiles (Correia et al. 2011; Labourel et al. 2016; Falck et al. 2018; Schmitz et al. 2022b). Arabinoxylanases are a common part of the cellulosome, a multi-enzyme complex used by microorganisms for extracellular degradation of plant cell walls. It is therefore typical for these enzymes to include a dockerin domain, to mediate stable integration of the enzyme, as well as different carbohydrate-binding modules (CBM) in addition to their catalytic domain, to be able to bind various hemicellulose substrates (Bayer et al. 2004).
The enzymes are attributed to the GH5_34 subfamily based on sequence as well as structural similarity and catalytic conservation, having two glutamic acids acting as the catalytic nucleophile and proton donor, respectively (Correia et al. 2011; Labourel et al. 2016). To date, only nine GH5_34 subfamily sequence entries are indexed in the Carbohydrate Active enZymes (CAZy) database (http://www.cazy.org; Lombard et al. 2013), and only four enzymes have been characterized. The extensively characterized enzyme CtXyn5A from Acetivibrio thermocellus (initially Clostridium thermocellum) (Tindall 2019) remains the only structurally characterized GH5_34 enzyme (Brás et al. 2011; Correia et al. 2011; Labourel et al. 2016), and this enzyme is commercially available in analytical amounts. Three other GH5_34 enzymes have previously shown activity on AX from rye, wheat and corn (Labourel et al. 2016), but these candidates have not been studied in further detail. Therefore, it is of interest to prioritize and characterize novel GH5_34 enzymes, in order to extend the knowledge of this small GH5 subfamily and identify industrially relevant enzymes with unique specificities and suitable traits for production of AXOS from agricultural biomass.
In this study, a novel GH5_34 subfamily arabinoxylanase, here termed HhXyn5A, was identified. The gene for HhXyn5A was initially assigned as originating from Herbinix hemicellulosilytica, an anaerobic thermophilic cellulose-converting bacterium isolated from a biogas reactor metagenome (Koeck et al. 2015). HhXyn5A was here produced, purified and characterized based on stability and substrate specificity, at favorable reaction conditions for cleavage of AX. Several other hemicellulose degrading enzymes have previously been characterized from H. hemicellulosilytica (attributed to families GH10, GH11, GH43 and GH51) (Mechelke et al. 2017); however, the arabinoxylanase investigated here has previously been overlooked. Moreover, in this study, the product profile of HhXyn5A was compared to the previously characterized CtXyn5A (Labourel et al. 2016), in order to evaluate its potential use in biomass processing, and interesting differences were identified. To allow deeper investigation of interacting residues, a tertiary structure model was created for performing docking simulations with different AXOS ligands. This is the first GH5_34 arabinoxylanase to be investigated from H. hemicellulosilytica, and one of few GH5_34 enzymes to be extensively characterized.

Cloning, expression and purification of HhXyn5A variants
Production of the full-length multi-modular enzyme variant HhXyn5A-FULL was not successful (data not shown), whereas the production and purification of the truncated HhXyn5A variants (Figure 1A; the two-domain variant HhXyn5A (54 kDa) and the sole catalytic domain variant HhXyn5A-CAT (37 kDa)) resulted in high yields (0.15 and 0.23 g/L culture for HhXyn5A and HhXyn5A-CAT, respectively) of soluble recombinant proteins (Figure 1B). A single-step purification by immobilized metal ion affinity chromatography (IMAC) resulted in near-homogeneous purity of both HhXyn5A variants (Figure 1B). The purified enzyme variants remained soluble even after storage for 3 months at 4 °C, both in elution and reaction buffers.

Enzyme activity optimum on wheat arabinoxylan
The most suitable reaction conditions regarding pH and temperature for a 10 min reaction on wheat arabinoxylan (WAX) were investigated for both HhXyn5A and CtXyn5A using a full factorial design experimental set-up, where the two factors were varied simultaneously. The HhXyn5A-CAT variant did not demonstrate detectable activity and could hence not be evaluated in this regard. The models created from the experimental data had high validity scores: R² = 0.82 and Q² = 0.68 for HhXyn5A, and R² = 0.79 and Q² = 0.61 for CtXyn5A. The resulting response surface plots (Figure 2) imply that pH and temperature are non-correlated factors for activity, determined as reducing end formation by the DNS assay. The models also suggest that for maximal activity over a 10 min reaction on WAX, a pH around 6.5 should be used for both enzymes, whereas the temperature resulting in highest activity is around 50 °C for HhXyn5A and 60 °C for CtXyn5A. These optimum conditions were validated for both enzymes (data not shown) and were thereafter used when performing further experiments.
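As a minimal sketch of how such validity scores can be obtained, the snippet below fits a quadratic response surface to a 3 × 3 full factorial grid and computes R² together with a leave-one-out Q². The quadratic model form, the factor levels, the function names and the activity values are assumptions made for illustration only; they are not the data or the software used in this study.

```python
import numpy as np

def design_matrix(ph, temp, ph_center=6.5, temp_center=50.0, temp_step=10.0):
    """Quadratic response-surface terms on coded factors: 1, a, b, a^2, b^2, a*b."""
    a = np.asarray(ph, float) - ph_center                     # coded pH
    b = (np.asarray(temp, float) - temp_center) / temp_step   # coded temperature
    return np.column_stack([np.ones_like(a), a, b, a**2, b**2, a * b])

def r2_q2(ph, temp, y):
    """Return (R2, Q2). Q2 is based on leave-one-out prediction errors (PRESS)."""
    y = np.asarray(y, float)
    X = design_matrix(ph, temp)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - np.sum((y - X @ beta) ** 2) / ss_tot
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i            # drop run i, refit, predict it
        b_loo, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        press += (y[i] - X[i] @ b_loo) ** 2
    return r2, 1.0 - press / ss_tot

# Hypothetical relative activities on a pH x temperature grid (not study data)
ph = [5.5, 5.5, 5.5, 6.5, 6.5, 6.5, 7.5, 7.5, 7.5]
temp = [40, 50, 60, 40, 50, 60, 40, 50, 60]
activity = [0.42, 0.61, 0.50, 0.63, 0.95, 0.74, 0.40, 0.66, 0.47]

r2, q2 = r2_q2(ph, temp, activity)
print(f"R2 = {r2:.2f}, Q2 = {q2:.2f}")
```

Q² can never exceed R² for a least-squares fit of this kind, so a large gap between the two is a sign that the model describes the measured runs better than it predicts new ones.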

HhXyn5A thermostability and irreversible deactivation
To further investigate the stability of HhXyn5A, the melting temperature (Tm) and activity loss over time (irreversible deactivation) were studied. The Tm values derived by nanoscale differential scanning fluorometry (nanoDSF) for HhXyn5A and CtXyn5A (Table 1) show protein unfolding around 61 and 73 °C, respectively, at pH 6.5. The sole catalytic domain construct HhXyn5A-CAT displayed a higher Tm than the two-domain variant HhXyn5A. Addition of the WAX substrate resulted in an increase in Tm of up to 3 °C for HhXyn5A (Table 1), indicating that the enzyme-substrate interactions stabilized the structure. Irreversible deactivation studies (without substrate) showed that at its Tm of 61 °C at pH 6.5, HhXyn5A lost 50 percent of its activity after only 4 min (Figure 3), while being stable at 50 °C for more than 36 h. This coincides with the boundaries of the optimization model and supports the chosen reaction conditions. There was no difference in either activity or stability when HhXyn5A was eluted in the enzyme formulation buffer or in the CtXyn5A formulation buffer (data not shown). This shows that the addition of imidazole, glycerol, CaCl2 and NaCl affected neither the stability nor the activity of HhXyn5A under these reaction conditions.
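To put the reported half-life into kinetic terms, the following sketch converts it into a deactivation rate constant and a residual activity. It assumes simple first-order deactivation, A(t) = A0·exp(−kt), which is a common simplification and not a model stated in this study; the function names are illustrative.

```python
import math

def deactivation_rate_constant(half_life_min):
    """First-order rate constant k (1/min) from a half-life: k = ln(2) / t_half."""
    return math.log(2) / half_life_min

def residual_activity(t_min, half_life_min):
    """Fraction of initial activity left after t_min, assuming A(t) = A0 * exp(-k t)."""
    return math.exp(-deactivation_rate_constant(half_life_min) * t_min)

# Half-life of 4 min, as reported for HhXyn5A at 61 °C and pH 6.5
k_61 = deactivation_rate_constant(4.0)
print(f"k = {k_61:.3f} 1/min")
print(f"residual activity after 8 min: {residual_activity(8.0, 4.0):.2f}")
```

Under this assumption, two half-lives leave one quarter of the initial activity; real deactivation curves may deviate from first-order behavior and should be fitted to the measured time course.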
HhXyn5A-CAT showed no detectable activity on any of the substrates tested, suggesting that the CBM6 is necessary for catalytic function. The chromatograms from high performance anion exchange chromatography with pulsed amperometric detection (HPAEC-PAD) show that HhXyn5A produced a clearly different oligosaccharide profile than CtXyn5A after enzymatic reaction for 24 h on WAX and alkali extracted oat bran fibers (OBF) (Figure 4). Both enzymes produced product peaks appearing at the same retention time as standards A3X and XA3XX, as well as a pattern of oligosaccharides with longer retention times. However, CtXyn5A produced additional shorter quantifiable XOS products, for example peaks eluting at the same retention times as X4 to X6, as well as multiple unknown oligosaccharides appearing at shorter retention times, some of which showed retention times corresponding to maltooligosaccharides (standards not shown).

Alignment of GH5_34 enzymes
The GH5_34 enzymes currently indexed in the CAZy database were compared with the novel HhXyn5A using a multiple sequence alignment of their catalytic GH5 modules (Figure 5) and CBM6 (Figure 6). For the catalytic GH5 modules, the two glutamic acid catalytic residues conserved within the family (Figure 5, yellow columns) are present in HhXyn5A, as well as other residues involved in hydrogen bonding or hydrophobic interactions of AXOS for CtXyn5A (Figure 5, red columns), for example, residues Glu68, Tyr92 and Asn139, which are important for arabinose interaction in the −2* subsite (Labourel et al. 2016). Additionally, the GH5_34 enzymes from Verrucomicrobiae bacterium (VbGH5), Gonapodya prolifera (GpGH5) and Acetivibrio cellulolyticus (AcGH5) from the CAZy database all have similar residue conservation and have all been shown to hydrolyze commercial arabinoxylans from wheat, rye and corn, displaying a similar (A)XOS product pattern to CtXyn5A (Labourel et al. 2016). It should be noted that the tryptophan residue present in the amino acid sequence of both CtXyn5A and HhXyn5A (Figure 5, purple column) is missing in the tertiary crystal structure of CtXyn5A (PDB 2Y8K and PDB 5LA2), which instead presents gaps in this area.
A multiple sequence alignment was additionally generated for the CBM6, found as auxiliary modules of the GH5_34 enzymes, together with the well-characterized CBM6 from the GH11 xylanase from A. thermocellus (initially C. thermocellum) (CtGH11-CBM6) and the CBM6 from HhXyn5A (HhGH5-CBM6) (Figure 6). The aromatic residues (red columns) and hydrogen bonding residues (green columns) that have previously been identified as important for XOS binding in CBM6 in general (Charnock et al. 2000) and for CtGH11-CBM6 specifically (Czjzek et al. 2001; Pires et al. 2004) are not conserved. Further investigation of the structural implications of these amino acid changes was performed using homology modeling and docking studies (see "HhXyn5A homology modeling and ligand docking").

HhXyn5A homology modeling and ligand docking
In order to investigate the structure of HhXyn5A and compare it to the structures of CtXyn5A and CtGH11-CBM6, a homology model was created and used for docking simulations with three ligands: XXXA3 and XXXA3XX in the active site of the catalytic domain, and X5 in the CBM6. The HhXyn5A homology model displayed a tertiary fold and active site structure very similar to the CtXyn5A crystal structure. Docking simulations using XXXA3 as ligand further revealed a ligand fit comparable to the previous crystallographic complex and simulations presented for CtXyn5A (Figure 7) (Brás et al. 2011; Correia et al. 2011; Labourel et al. 2016; Falck et al. 2018). The Xylp-α-1,3-Araf unit of the ligand binds in a pocket containing the −2* and −1 subsites, whereas the other Xylp sugars make only weak, limited interactions with the enzyme (Figure 7B), similar to what has previously been suggested for CtXyn5A (Labourel et al. 2016). The modeled HhXyn5A-ligand complex formed hydrogen bonds from Glu19, Asn86, Gly87, Asn90 and Asn121 to arabinose in the −2* subsite and xylose in the −1 subsite, which correspond to the hydrogen bonding shown for CtXyn5A residues Glu68, Asn135, Gly136, Asn139 and Asn170 (Figure 7D). In addition, the hydroxyl groups of the −2 to −4 Xylp residues point outwards into the solvent for both enzymes, indicating that a substrate with multiple arabinose substitutions could also be accommodated.
The residue Trp236 of HhXyn5A, responsible for the bulky loop region close to the active site (Figure 7C), is also present in the amino acid sequence of CtXyn5A (Figure 5, purple column). However, the tertiary crystal structure is missing some of the residues in this region, instead presenting gaps and therefore making the loop adopt a smaller form (Figure 7B) compared with HhXyn5A. The longer ligand XXXA3XX can nevertheless be accommodated in the active site and shows an aromatic stacking interaction with Trp126 in the +2 subsite (Figure 7E).
In contrast to the catalytic domains, the CBM6 of the two enzymes are quite different, both structurally and in amino acid sequence. Previous studies on CBM6, including a study on CtGH11-CBM6, have suggested that XOS substrates primarily bind in a region termed cleft A, located in the loop region between the two β-sheets of the classical jelly roll fold (Figure 8A-B). The hydrophobic interactions and stacking between the ligand sugars and aromatic residues in this cleft have been shown to be essential for substrate binding and specificity (Charnock et al. 2000; Czjzek et al. 2001; Pires et al. 2004). As previously seen in the multiple sequence alignment, HhGH5-CBM6 is missing the aromatic residues important for X5 stacking that are present in CtGH11-CBM6 and the CBM6 of CtXyn5A (CtGH5-CBM6) (Figure 8C-D). The CtGH11-CBM6 aromatic residues Trp92 and Tyr34, corresponding to Phe478 and Trp424 in CtGH5-CBM6, are replaced with the acidic Asp430 and basic His376 in HhGH5-CBM6 (Figure 8D). Residue Asn120, important for hydrogen bonding of X5 in CtGH11-CBM6 (Pires et al. 2004), is also replaced by a longer Lys459 residue in HhGH5-CBM6, and residue Asn93, interacting with X5 at subsites 4 and 5, is replaced by the smaller Gly431.
The amino acid differences in HhGH5-CBM6 result in a surface alteration of cleft A compared to CtGH11-CBM6 and CtGH5-CBM6 (Figure 8A-C), as well as quite different ligand binding, as suggested from docking simulations of X5 in HhGH5-CBM6. The second Xylp residue does form a stacking interaction with His376; however, the loss of the second aromatic residue Trp92 (CtGH11-CBM6 numbering) and other interacting amino acids has made the well-defined cleft disappear (Figure 8C). For CtGH5-CBM6, the differences in amino acid sequence compared to CtGH11-CBM6 do not seem to result in tertiary structural changes in cleft A, suggesting that X5 will bind in a similar way, as illustrated by the overlay of the ligand in the potential binding cleft between the two aromatic residues Phe478 and Trp424 (Figure 8B). The CBM6 family displays another binding region, cleft B, located on the flat β-sheet surface, which, in contrast to cleft A, is reported to be specific for cellulose and glucan recognition (Charnock et al. 2000). When performing a global docking simulation of X5 on HhGH5-CBM6, one of the binding results with lowest energy was located close to cleft B, where Trp382 could potentially make a hydrophobic interaction with X5 (Figure 8E). However, the binding site is shallow and is missing a second aromatic residue for potential stacking. No alternative deep cleft region, corresponding to cleft A with two aromatic residues, seems to be available at the HhGH5-CBM6 surface.

Discussion
The novel GH5_34 arabinoxylanase HhXyn5A could be produced at high yields with long storage stability, which makes the enzyme a potential candidate for large scale production, as no domain splitting or aggregation was observed. In contrast, the CtXyn5A formulation is dependent on imidazole to prevent enzyme aggregation and precipitation (Schmitz et al. 2022a). HhXyn5A also displayed high stability at optimal reaction conditions of 55 °C and pH 6.5 (Table 1 and Figure 3). The observed temperature optimum for HhXyn5A between 40 and 60 °C (Figure 2A) is to be expected, since H. hemicellulosilytica was isolated from a biogas reactor incubating at 55 °C (Koeck et al. 2015). The higher Tm observed for the sole catalytic domain variant HhXyn5A-CAT indicates that unfolding starts with the CBM6 or in the hinge region between the two domains. It can, however, not be excluded that some type of misfolding occurred prior to the nanoDSF analysis, as the HhXyn5A-CAT construct did not demonstrate detectable activity on any of the substrates tested.
HhXyn5A displayed comparable activity to CtXyn5A on WAX and RAX, and neither enzyme was active on BX. Thus, HhXyn5A requires an arabinose substituent for catalytic function, in accordance with the profile of CtXyn5A, which has previously been shown to be unable to hydrolyze non-substituted linear xylo-oligosaccharides (Labourel et al. 2016). This corroborates the importance of arabinose substituents binding in the −2* subsite of the active site in the GH5_34 subfamily.
The peaks observed for both HhXyn5A and CtXyn5A at the retention time of the known standard A3X (Figure 4) are instead likely to correspond to XA3, with the Araf residue on the reducing end Xylp, which is a proven product of CtXyn5A (Correia et al. 2011). As the two compounds have nearly identical structures, it is likely that A3X and XA3 have similar retention times. The peak eluting at the same retention time as standard XA3XX is unlikely to correspond to this product, as an Araf residue is required in the −2* subsite of the enzyme, at the non-reducing end of the ligand. Therefore, XXXA3 is a more likely product to result from HhXyn5A and CtXyn5A hydrolysis, and may display a similar retention time. However, the coelution of XA3XX and XXXA3 cannot be confirmed due to lack of standards.
No hydrolysis of BG was observed using HhXyn5A, which is a beneficial trait for selective processing of cereal arabinoxylan to oligosaccharides, keeping polymeric BG of a high molecular weight intact. Polymeric BG is a valuable and desired product due to its health benefits, for example, lowering blood glucose levels and cholesterol (Wang and Ellis 2014).
The difference in the product profile shown on OBF and WAX (Figure 4) for the two enzymes is an interesting and somewhat unexpected finding, considering that the overall active site structure of HhXyn5A is very similar to that of CtXyn5A, based on both sequence alignment (Figure 5) and tertiary structure modeling (Figure 7). The homology model of HhXyn5A and the docking simulations performed in the active site of the catalytic domain confirmed conservation of the active site residues (Figure 5) and demonstrated that the short ligand XXXA3 can be accommodated in a similar way as presented for CtXyn5A (Figure 7B-D) (Labourel et al. 2016). Taking into account the discrepancy between the CtXyn5A amino acid sequence and its determined crystal structure with regard to the missing tryptophan residue (Trp175), HhXyn5A also does not present any obvious conformational differences to CtXyn5A in proximity to the active site that could sterically hinder longer or more substituted AXOS, so this does not explain the heterogeneity in product formation between the enzymes. Although the placement of the affinity tag for IMAC purification differs between HhXyn5A (N-terminal placement) and CtXyn5A (C-terminal placement), this would likely not affect the activity or product profile of the enzymes, as the affinity tag sequences are distantly located from the active site in the tertiary structure (Figure 7A). C-terminal affinity tag placement substantially reduced the storage stability and integrity of HhXyn5A (data not shown). Docking of a longer ligand, XXXA3XX, to HhXyn5A resulted in accommodation comparable to that seen in the docking simulation with CtXyn5A and showed a similar aromatic stacking interaction with Trp126 (Trp175 in CtXyn5A) in the +2 subsite (Figure 7D). Based on these docking results, it is likely that factors other than the GH5 active site architecture have a greater influence on the difference in activity profile between the two enzymes.
The non-conserved sequence corresponding to cleft A in HhGH5-CBM6 (Figure 6) is likely a factor of importance for the activity profile. Cleft A has previously been identified as important for XOS binding in the CBM6 family, and differences in sequence conservation have been proposed as an explanation for the diverse binding specificity seen within this CBM family (Czjzek et al. 2001; Michel et al. 2009). In this case, the missing important residues in HhGH5-CBM6 lead to an alteration of the tertiary structure, loss of the cleft and loss of potential for multiple stabilizing aromatic stacking interactions with ligands (Figure 8C-D). HhGH5-CBM6 might therefore rely more on hydrogen bonding and polar interactions for binding its substrate, for example through hydrogen bonding with Glu363, similar to the interaction displayed between X4 and Glu20 for the CBM6 of endoglucanase 5A from Cellvibrio mixtus (Pires et al. 2004). The more open cleft A structure of HhGH5-CBM6 may allow binding of xylans with various types of substituents. On the other hand, residues Lys378 and Asp456 of HhGH5-CBM6 could sterically hinder substrates longer than the X5 ligand used here, at the reducing end (Figure 8C).
Depending on the substitution pattern of the AX substrate, the HhGH5-CBM6 structure may be less adapted to bind certain regions of the AX polymer than the corresponding CtGH5-CBM6, which could result in the additional end products seen on WAX and OBF using CtXyn5A as compared with HhXyn5A (Figure 4). Further analysis of this would, however, require structural data for HhGH5-CBM6. Moreover, the oat fiber substrate is not a commercial product, and the remaining impurities, potentially containing other fibers or structures connected to the oat AX, may affect the enzymatic activity (Schmitz et al. 2022b). This could be another reason for the differing product patterns between the two enzymes, where the oat substrate might be inaccessible for cleft A in CtGH5-CBM6, whereas the HhGH5-CBM6 structure might be more suitable for binding. This hypothesis is supported by previously reported effects of other CBM6 on the catalytic activity on more recalcitrant or insoluble substrates (Charnock et al. 2000).
Alternatively, the AX substrates might bind in a different location than cleft A in HhGH5-CBM6, potentially close to cleft B as demonstrated by docking simulations (Figure 8E), albeit there is no clear cleft structure or potential for multiple aromatic stacking interactions in this region. Previous studies show that the secondary structure of the binding site is fundamental for substrate selectivity (Johnson et al. 1996;Simpson et al. 2000) and that a single mutation of the residues in the binding cleft of a xylan-binding CBM can change the orientation of other residues and may result in changed or decreased specificity and even loss of activity (Simpson et al. 2000;Czjzek et al. 2001). It is thus most likely that the considerable difference in CBM6 cleft A between HhXyn5A and CtXyn5A is an important factor responsible for the different product patterns resulting after the respective enzymes' activities on WAX and OBF.
Differences seen in product profile and substrate specificity have previously been assigned to differing domain organization of GH5_34 enzymes, for example, additional CBMs with alternative specificity or other catalytic domains with functions that mediate serial degradation of complex materials or increase activity on insoluble substrates (Charnock et al. 2000; Labourel et al. 2016). It would therefore also be of interest to investigate how other CBMs and the various domains, not related to attachment to the cellulosome, of the full-length HhXyn5A enzyme can influence the stability, activity and substrate specificity, for example, for the direct application of the enzyme in industrial oat fractionation processes, in order to increase the yield of soluble fiber and prebiotic AXOS in novel oat products.
To summarize, in this study the first GH5_34 arabinoxylanase from H. hemicellulosilytica was extensively characterized regarding stability, substrate specificity, product profile and substrate interaction. The novel GH5_34 arabinoxylanase HhXyn5A can be produced in high yields in Escherichia coli, and is stable during storage after purification and retains its activity at processing temperatures for several days, making it an interesting candidate for applied commercial use. HhXyn5A displays activity on commercial AX substrates from various grains, comparable to the activity of the commercially available homolog CtXyn5A with no side activity on BG. In contrast to CtXyn5A, HhXyn5A gave a more specific oligosaccharide product profile when using WAX as substrate, with longer XOS products. In addition, HhXyn5A was shown to be efficient on commercial WAX and RAX, as well as extracted oat fibers, producing several AXOS compounds with potential prebiotic effect, previously shown to be consumed by a probiotic Bifidobacterium species (Bhattacharya et al. 2020). The similarities of the GH5 active sites and the dissimilarities of the CBM6 binding sites between the homologs HhXyn5A and CtXyn5A suggest that it is the difference in the substrate binding sites of the CBMs that results in different product profiles. Further structural studies are needed to elucidate whether the difference is due to steric hindrance in cleft A of HhGH5-CBM6, or due to the presence of an alternative recognition site for AXOS in this enzyme.

Enzyme candidate selection and sequence modifications
The truncated two-domain sequence of GH5_34 arabinoxylanase CtXyn5A (RefSeq WP_003513669.1 (Wilson et al. 2013)) from A. thermocellus (initially C. thermocellum) was used as query to search for similar sequences by blastp in the NCBI non-redundant protein sequence database under default parameter values. A putative GH5_34 subfamily arabinoxylanase (GenBank NLC19267.1 (Campanaro et al. 2020)) from Clostridiales bacterium (initially H. hemicellulosilytica) with 67.4 percent sequence identity (98 percent coverage) to CtXyn5A was selected. The GX757_08645 gene (GenBank JAAZDK010000174.1) encoding the full-length multimodular enzyme variant HhXyn5A-FULL, as well as two truncated sequence variants, were cloned (Figure 1A). One truncated construct consisted of the catalytic GH5 domain and CBM6, for simplicity termed HhXyn5A, and the other, HhXyn5A-CAT, consisted of the sole catalytic domain. All constructs were cloned with an N-terminal His6-tag encoding sequence. The commercial GH5 arabinoxylanase CtXyn5A (product number CZ00601, NZYTech) is also a truncated two-domain enzyme (analogous to HhXyn5A) produced in E. coli.

Cloning, expression and purification of recombinant HhXyn5A variants
Sequences encoding HhXyn5A and HhXyn5A-CAT were vector-adapted maintaining the native codon landscape, synthesized and cloned (Bio-Cat, Germany) in pET-21b(+) vectors (Merck). Heterologous overexpression was performed in E. coli BL21(DE3) (Merck). The enzyme targets were expressed by initial cell cultivation in Lysogeny Broth (LB)-Lennox at 37 °C with shaking at 200 rpm until OD600 = 0.6-0.8. Isopropyl β-D-galactopyranoside was then added to 0.2 mM concentration to induce expression at 30 °C with shaking at 200 rpm for 4 h. The cells were harvested by centrifugation at 10,500 × g for 15 min at 4 °C, washed and re-suspended in lysis buffer (50 mM HEPES-NaOH pH 7.4 (RT), 500 mM NaCl, 50 mM imidazole and 5 percent (v/v) glycerol). After lysis of the cells by sonication (0.5 cycle for 10 min at amplitude 60 percent) using a UP 400S homogenizer (Hielscher Ultrasonics), the obtained lysate was clarified by centrifugation at 26,800 × g for 30 min at 4 °C and the supernatant was filtered through 0.22 μm pore size filters.
Purification of proteins from clarified lysate was performed by IMAC, with Ni2+ serving as ligand, on an ÄKTA Start system (GE Healthcare Life Sciences) using a 1 mL HisTrap HP (7 × 25 mm) column (Cytiva). The proteins with N-terminal affinity tag, bound to the column in lysis buffer, were eluted with elution buffer (50 mM HEPES-NaOH pH 7.4 (RT), 500 mM NaCl, 500 mM imidazole and 10 percent (v/v) glycerol) after an extensive wash with lysis buffer. The purified proteins were dialyzed at 4 °C through 3500 Da MWCO regenerated cellulose membranes against formulation buffer (50 mM HEPES-NaOH pH 7.2 (RT)) at 1:5000 volume ratio. Dialyzed samples were filtered through 0.22 μm pore size filters. In parallel, aliquots of the HhXyn5A enzyme variants were also eluted with CtXyn5A formulation buffer according to the manufacturer's product information (35 mM HEPES-NaOH pH 7.5 (RT), 200 mM imidazole, 750 mM NaCl, 3.5 mM CaCl2 and 25 percent (v/v) glycerol) and filtered through 0.22 μm pore size filters. The protein concentration was determined by measuring absorbance at 280 nm (A280 of 1 = 1 mg/mL, also taking the absorption coefficient into account) using a BioSpec-nano spectrophotometer (Shimadzu). Purity and integrity of proteins were analyzed by visualization on 4-15 percent glycine-SDS-PAGE (Bio-Rad).
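The absorbance-based concentration estimate can be illustrated with the Beer-Lambert law. The sketch below is illustrative only: the molar absorption coefficient is a hypothetical placeholder, not the value used for HhXyn5A, and the function name is an assumption; only the molecular weight of 54 kDa is taken from the text.

```python
def protein_conc_mg_per_ml(a280, ext_coeff_m1_cm1, mol_weight_da, path_cm=1.0):
    """Beer-Lambert: c [mol/L] = A / (eps * l); times MW [g/mol] gives g/L = mg/mL."""
    return a280 / (ext_coeff_m1_cm1 * path_cm) * mol_weight_da

# HhXyn5A is about 54 kDa; the coefficient below is a hypothetical example value
conc = protein_conc_mg_per_ml(a280=0.5, ext_coeff_m1_cm1=100_000, mol_weight_da=54_000)
print(f"{conc:.2f} mg/mL")
```

In practice the absorption coefficient would be calculated from the amino acid sequence of each construct, since it differs between the two-domain variant and the sole catalytic domain.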

Enzyme activity assay
Enzyme activity was measured by monitoring reducing end formation from different substrates with the di-nitrosalicylic acid (DNS) assay (Miller 1959). One unit (U) of arabinoxylanase activity was defined as the amount of enzyme required to produce 1 μmol of D-xylose reducing sugar equivalents per min from 10 mg/mL WAX in 50 mM HEPES-HCl pH 6.5 (60°C) buffer. Substrates were suspended at 10 mg/mL in reaction buffer (10 mM HEPES-HCl pH 6.5 (RT)) and added to 4.5 mL glass vials at a 9:1 volume ratio of substrate to enzyme preparation (0.2 mg/mL). The reactions were incubated in thermoshakers at 50°C for HhXyn5A or HhXyn5A-CAT and 60°C for CtXyn5A, with shaking at 500 rpm for all reactions. Samples were collected after 10 min, 1 h and 24 h for the DNS assay and oligosaccharide analysis. DNS reagent was added to samples at a 1:1 volume ratio to stop the reaction, and the samples were then incubated at 100°C for 10 min and chilled on ice before measuring absorbance at 540 nm using a Multiskan GO microplate spectrophotometer (Thermo Fisher Scientific). Samples collected for oligosaccharide analysis were boiled for 10 min, subsequently diluted and filtered through 0.2 μm pore size filters.
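As a worked illustration of the unit definition, the sketch below converts DNS absorbance readings into specific activity via a linear D-xylose standard curve. It assumes that standards and samples are treated identically in the DNS step and that a substrate blank is subtracted. The function names and all numerical values are hypothetical.

```python
import numpy as np


def fit_standard_curve(xylose_umol_per_ml, a540):
    """Fit a linear xylose standard curve; returns (slope, intercept)."""
    slope, intercept = np.polyfit(xylose_umol_per_ml, a540, 1)
    return float(slope), float(intercept)


def specific_activity_u_per_mg(a540_sample, a540_blank, slope, minutes, enzyme_mg_per_ml):
    """Specific activity in U/mg (1 U = 1 umol xylose equivalents per min).

    Subtracting the blank cancels the intercept of the standard curve,
    so only the slope is needed for the conversion.
    """
    reducing_umol_per_ml = (a540_sample - a540_blank) / slope
    return reducing_umol_per_ml / minutes / enzyme_mg_per_ml


# Hypothetical standards: 0-4 umol/mL xylose.
std_conc = np.array([0.0, 1.0, 2.0, 4.0])
std_a540 = np.array([0.05, 0.30, 0.55, 1.05])
slope, intercept = fit_standard_curve(std_conc, std_a540)

# A 9:1 substrate:enzyme mix of a 0.2 mg/mL preparation gives 0.02 mg/mL in the reaction.
activity = specific_activity_u_per_mg(0.55, 0.05, slope, minutes=10, enzyme_mg_per_ml=0.02)
```

The 0.02 mg/mL enzyme concentration follows from the 9:1 dilution of the 0.2 mg/mL preparation described above; everything else is an illustrative assumption.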
Substrates including WAX (low viscosity), RAX and oat BG (medium viscosity) were purchased from Megazyme, whereas BX was purchased from Sigma-Aldrich. Alkali soluble OBF were extracted (Schmitz et al. 2022b) from an insoluble oat fiber bran fraction obtained from oat processing, provided in 2020 by Lantmännen Oats (Sweden) from their oat mill production site in Kimstad (Sweden). In short, the insoluble OBFs were milled, resuspended in water to 50 g/L and destarched using α-amylase and amyloglucosidase after gelatinization at 70°C for 2 h at 40°C. After washing with water, cloth filtration and freeze drying, the fibers were resuspended in Milli-Q purity grade water to 100 g/L, sonicated (10 min at 35 kHz) using a UP 400S homogenizer and centrifuged at 3893 × g for 10 min. The collected fibers were further resuspended in 5 M NaOH to 100 g/L (of initial fiber weight) and incubated at 60°C for 9 h with constant shaking. After centrifugation at 3893 × g for 10 min, the supernatant was neutralized to pH 5-6 using 37 percent (w/w) HCl and the alkali soluble fibers were then precipitated using four volumes of 99 percent (v/v) ethanol overnight at 4°C. Finally, the fibers were recovered by centrifugation at 3893 × g for 5 min to remove the ethanol, washed with water and freeze dried. The AX content of the oat fibers was determined through acid hydrolysis and quantification of arabinose and xylose by HPAEC-PAD as previously described (Schmitz et al. 2021). All AXOS substrates had a similar A/X ratio of 0.6.
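The A/X ratio and AX content follow from the monosaccharide amounts released by acid hydrolysis. The sketch below assumes the conventional anhydro-pentose correction factor of 0.88 (132/150, which accounts for the water added on hydrolysis). Whether this exact factor was applied in the cited protocol is an assumption, and the function names and sample values are hypothetical.

```python
ANHYDRO_PENTOSE_FACTOR = 0.88  # assumed correction: 132 g/mol (anhydro) / 150 g/mol (free pentose)


def ax_ratio(arabinose_mg, xylose_mg):
    """Arabinose-to-xylose ratio; mass and molar ratios coincide
    because both pentoses have the same molecular weight."""
    if xylose_mg <= 0:
        raise ValueError("xylose amount must be positive")
    return arabinose_mg / xylose_mg


def ax_content_percent(arabinose_mg, xylose_mg, sample_dry_mg):
    """AX content as percent of sample dry weight, with anhydro correction."""
    polymer_mg = ANHYDRO_PENTOSE_FACTOR * (arabinose_mg + xylose_mg)
    return 100.0 * polymer_mg / sample_dry_mg


# Hypothetical hydrolysate: 3 mg arabinose and 5 mg xylose from 10 mg fiber.
ratio = ax_ratio(3.0, 5.0)
content = ax_content_percent(3.0, 5.0, 10.0)
```

These placeholder amounts give an A/X ratio of 0.6, matching the ratio reported for the substrates, and an AX content of 70.4 percent of dry weight.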

Temperature and pH optimization using experimental design
Optimal temperature and pH for a 10 min reaction of HhXyn5A and CtXyn5A on WAX were investigated with a quadratic full-factorial experimental design at three levels using MODDE 12.1 (Sartorius Stedim Data Analytics). To set the experimental conditions, 10 mM Tris-HEPES-acetate buffer at pH 4, 6.5 and 9 was used together with temperatures of 30, 58.1 and 70°C for HhXyn5A and 30, 60 and 90°C for CtXyn5A. The enzyme activity assay was performed at micro scale in an incubating thermocycler in heat gradient mode. The results were then interpreted in MODDE to create a model for predicting the best reaction conditions for each enzyme.
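The idea behind a quadratic response-surface model can be sketched with ordinary least squares in NumPy. This is not the MODDE implementation. It is an illustrative stand-in that fits y = b0 + b1·T + b2·pH + b11·T² + b22·pH² + b12·T·pH to a three-level full-factorial grid and solves for the stationary point. The response values are synthetic, generated from an assumed optimum, and the function names are hypothetical.

```python
import numpy as np


def fit_quadratic_surface(temp, ph, response):
    """Least-squares fit of a full quadratic model in temperature and pH."""
    t = np.asarray(temp, dtype=float)
    p = np.asarray(ph, dtype=float)
    design = np.column_stack([np.ones_like(t), t, p, t**2, p**2, t * p])
    coef, *_ = np.linalg.lstsq(design, np.asarray(response, dtype=float), rcond=None)
    return coef


def stationary_point(coef):
    """Solve grad(y) = 0; returns (T_opt, pH_opt, is_maximum)."""
    _, b1, b2, b11, b22, b12 = coef
    hessian = np.array([[2 * b11, b12], [b12, 2 * b22]])
    opt = np.linalg.solve(hessian, -np.array([b1, b2]))
    # A maximum requires a negative-definite Hessian.
    is_maximum = bool(np.all(np.linalg.eigvalsh(hessian) < 0))
    return float(opt[0]), float(opt[1]), is_maximum


# Design levels as used for HhXyn5A; the responses are SYNTHETIC (assumed optimum 55 C, pH 6.5).
grid = [(t, p) for t in (30.0, 58.1, 70.0) for p in (4.0, 6.5, 9.0)]
temps = np.array([g[0] for g in grid])
phs = np.array([g[1] for g in grid])
synthetic_activity = 100 - 0.05 * (temps - 55.0) ** 2 - 4.0 * (phs - 6.5) ** 2

coef = fit_quadratic_surface(temps, phs, synthetic_activity)
t_opt, ph_opt, is_max = stationary_point(coef)
```

Because the synthetic data are exactly quadratic, the fit recovers the assumed optimum. With real measurements, replicates and model diagnostics would be needed before trusting a predicted optimum.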

Enzyme stability
Enzyme melting temperature (Tm) was estimated at pH values ranging from 4 to 9, as well as at pH 6.5 with 1 percent (w/v) WAX, by nanoDSF using a Prometheus NT.48 instrument (NanoTemper Technologies). Standard grade glass capillaries (NanoTemper Technologies) were filled with enzyme solution at a concentration of 0.1-0.2 mg/mL. Thermal unfolding was performed with a temperature gradient between 20 and 95°C at a 1°C/min ramp rate, with the excitation power adjusted to 60 percent. Tm was determined from the first derivative of the fluorescence ratio 350/330 nm and was identified automatically by the instrument software PR.ThermControl (NanoTemper Technologies).
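The first-derivative approach to locating Tm can be sketched as follows. This is a simplified illustration, not the PR.ThermControl algorithm, which may smooth the data or fit the transition differently. The unfolding curve is a synthetic sigmoid with an assumed midpoint, and the function name is hypothetical.

```python
import numpy as np


def melting_temperature(temperature_c, ratio_350_330):
    """Return the temperature at which the first derivative of the
    350/330 nm ratio has its largest magnitude (transition midpoint)."""
    t = np.asarray(temperature_c, dtype=float)
    r = np.asarray(ratio_350_330, dtype=float)
    derivative = np.gradient(r, t)  # d(ratio)/dT by finite differences
    return float(t[np.argmax(np.abs(derivative))])


# SYNTHETIC unfolding curve over the 20-95 C ramp, midpoint assumed at 61 C.
temps = np.linspace(20.0, 95.0, 751)
ratio = 0.80 + 0.30 / (1.0 + np.exp(-(temps - 61.0) / 2.0))
tm = melting_temperature(temps, ratio)
```

Taking the absolute value of the derivative lets the same function handle transitions in which the ratio decreases on unfolding. Noisy experimental traces would normally be smoothed first.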
HhXyn5A thermostability and inactivation over time were investigated by incubating the enzyme at 50 and 60°C for 24 h, removing aliquots over time and immediately chilling them on ice to monitor the retained reaction rate. Retained activity was then evaluated in a 10 min reaction using 10 mg/mL WAX at 50°C with the DNS assay, performed at micro scale in an incubating thermocycler in heat gradient mode, and absorbance was measured using a Multiskan GO microplate spectrophotometer.
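One common way to summarize such time-course data is to express each aliquot's rate relative to the untreated enzyme and, if the decay is roughly exponential, fit a first-order inactivation constant. The first-order model is an assumption made here for illustration and is not stated in the method. The function names and data are hypothetical.

```python
import numpy as np


def residual_activity_percent(rates, initial_rate):
    """Retained activity of each aliquot relative to the untreated enzyme."""
    return 100.0 * np.asarray(rates, dtype=float) / float(initial_rate)


def first_order_half_life(times_h, residual_percent):
    """Fit ln(A/A0) = -k*t by linear regression; returns (k [1/h], half-life [h]).

    Assumes first-order inactivation and strictly positive residual activities.
    """
    t = np.asarray(times_h, dtype=float)
    y = np.log(np.asarray(residual_percent, dtype=float) / 100.0)
    slope, _ = np.polyfit(t, y, 1)
    k = -float(slope)
    return k, float(np.log(2.0) / k)


# SYNTHETIC time course generated with k = 0.1 per hour.
times = np.array([0.0, 1.0, 2.0, 4.0, 8.0, 24.0])
rates = 2.5 * np.exp(-0.1 * times)  # arbitrary rate units
residual = residual_activity_percent(rates, rates[0])
k_inact, half_life = first_order_half_life(times, residual)
```

If the log-transformed data are clearly non-linear, a single first-order constant is not an adequate description and the retained activities are better reported directly.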

Sequence alignment, homology modeling and docking simulations
To verify the assignment of HhXyn5A to the GH5_34 subfamily and to assess amino acid conservation, the catalytic domain and CBM6 sequences of the non-redundant GH5_34 subfamily enzymes indexed in the CAZy database (Drula et al. 2021) were retrieved from NCBI GenBank, aligned with HhXyn5A using the ClustalW online multiple sequence alignment tool (Sievers et al. 2011) and visualized using Jalview (Waterhouse et al. 2009). Domain boundaries were determined using the NCBI Conserved Domain Database 3.17 (Lu et al. 2019).
A homology model of HhXyn5A was created with SWISS-MODEL (Waterhouse et al. 2018), with a Global Model Quality Estimate score of 0.92 (scored from 0 to 1) and global QMEANDisCo value of 0.89 ± 0.05, using the truncated CtXyn5A as template (PDB 2Y8K). All tertiary structures were superimposed and structurally compared using PyMOL 2.5 (Schrödinger). Amino acid numbering corresponds to the respective enzyme tertiary structure numbering.
Docking of the ligands was then performed using AutoDock (Morris et al. 1998) as implemented in YASARA, using the default parameter values supplied and keeping the active site residues flexible throughout the docking simulation. The resulting enzyme-ligand complexes were refined by molecular dynamics simulations performed in YASARA following the software's standard procedure. The simulations were run for 5 ns using the AMBER14 force field for the solute, at 298 K and pH 6.5. The energy-minimized structures with an orientation similar to the crystallographic complexes were then used as final complexes and visualized in PyMOL. The resulting root-mean-square deviation plots for each molecular dynamics simulation are shown in Supplementary Figures S1-S3.
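For readers unfamiliar with the metric, the root-mean-square deviation between two coordinate sets can be computed as in the sketch below. It assumes the frames are already superimposed and atom-matched. It does not reproduce YASARA's own analysis, and the function name and coordinates are hypothetical.

```python
import numpy as np


def rmsd(coords_a, coords_b):
    """RMSD between two (N, 3) coordinate arrays of matched atoms.

    No superposition is performed; the structures are assumed to be
    pre-aligned to a common reference frame.
    """
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape or a.ndim != 2 or a.shape[1] != 3:
        raise ValueError("inputs must both have shape (N, 3)")
    squared_dist = np.sum((a - b) ** 2, axis=1)
    return float(np.sqrt(squared_dist.mean()))


# Hypothetical three-atom example: every atom displaced by 1 angstrom along x.
reference = np.zeros((3, 3))
shifted = reference + np.array([1.0, 0.0, 0.0])
example_rmsd = rmsd(reference, shifted)
```

Evaluating this quantity for each trajectory frame against the starting structure yields the kind of RMSD-versus-time trace used to judge whether a simulation has stabilized.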