utils

Utility functions for parsing Quantum ESPRESSO input and output files

FileGuesser(filetype, xml_filename, prefix)

Guesses the name of a file of the specified filetype that belongs to the same calculation as a given XML file. Candidates are built by combining the prefix from the XML with a set of extensions, adding a few common fixed filenames, and searching several directories near the XML file.

| Filetype | Extensions | Set filenames | Search directories |
|----------|------------|---------------|--------------------|
| pwin | $prefix.in, $prefix.pwi | bands.in, bands.pwi | ./, ../ |
| filproj | $prefix, $prefix.proj | filproj | ./, dos/, pdos/, projwfc/, ../pdos/, ../dos/, ../projwfc/ |
| fildos | $prefix.dos, $prefix.pdos | fildos, filpdos | ./, dos/, ../pdos/, projwfc/, ../dos/, ../projwfc/ |

Returns the actual filename for pwin. For the other types it returns the base name (the filproj/fildos/filpdos value), not the name of a file that exists on disk. For example, it returns dos/$prefix.proj instead of dos/$prefix.proj.projwfc_up for filproj, and dos/$prefix.dos instead of dos/$prefix.dos.pdos_tot for filpdos.
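
The candidate list described above can be pictured as a product of search directories and candidate names. The following is a minimal sketch under stated assumptions, not the library's implementation: `candidate_paths` is a hypothetical helper, the XML path is made up, and the ordering of the real guesses is not documented here.

```python
from itertools import product
from pathlib import Path


def candidate_paths(xml_filename, prefix, extensions, fixed_names, folders):
    """Hypothetical helper: combine every search folder with every candidate name."""
    xml_dir = Path(xml_filename).parent
    # Names built from the prefix come first, then the fixed filenames.
    names = [f"{prefix}{ext}" for ext in extensions] + list(fixed_names)
    return [xml_dir / folder / name for folder, name in product(folders, names)]


# Values from the "pwin" row of the table; the XML location is invented.
candidates = candidate_paths(
    "calc/out/Si.xml", "Si", [".in", ".pwi"], ["bands.in", "bands.pwi"], ["./", "../"]
)
for path in candidates:
    print(path)
```

With two folders and four names, the sketch yields eight candidates, which is why a later existence check is needed to narrow them down.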

TODO: include outdir in the guessing game.

| PARAMETER | DESCRIPTION |
|-----------|-------------|
| filetype | The type of file to guess. Can be "pwin", "filproj", "fildos", or "filpdos". TYPE: str |
| xml_filename | The filename of the XML file to guess from. TYPE: str |
| prefix | The prefix of the XML file. TYPE: str |

Source code in pymatgen/io/espresso/utils.py
def __init__(self, filetype, xml_filename, prefix):
    """
    Initializes the FileGuesser object with the filetype, XML filename, and prefix.

    Arguments:
        filetype (str): The type of file to guess. Can be "pwin", "filproj", "fildos", or "filpdos".
        xml_filename (str): The filename of the XML file to guess from.
        prefix (str): The prefix of the XML file.
    """
    self.filetype = filetype
    self.xml_filename = xml_filename
    self.prefix = prefix

    # Validate the filetype and set extensions, extras, and folders
    self._validate_filetype()

guess()

Guesses the appropriate filename based on the filetype, XML filename, and prefix.

| RETURNS | DESCRIPTION |
|---------|-------------|
| str | The guessed filename that matches the specified filetype. |

| RAISES | DESCRIPTION |
|--------|-------------|
| FileNotFoundError | If no appropriate file is found. |

Source code in pymatgen/io/espresso/utils.py
def guess(self):
    """
    Guesses the appropriate filename based on the filetype, XML filename, and prefix.

    Returns:
        str: The guessed filename that matches the specified filetype.

    Raises:
        FileNotFoundError: If no appropriate file is found.
    """
    guesses = self._generate_guesses()
    print(f"All guesses for filetype = {self.filetype}")
    print(guesses)

    # Filter guesses based on filetype-specific rules
    if self.filetype == "filpdos":
        guesses = [g for g in guesses if glob(f"{g}.pdos_*")]
    elif self.filetype == "filproj":
        guesses = [g for g in guesses if glob(f"{g}.projwfc_*")]
    else:
        guesses = [g for g in guesses if os.path.exists(g)]

    if not guesses:
        raise FileNotFoundError(
            f"All guesses for an appropriate {self.filetype} file don't exist."
        )

    if len(set(guesses)) > 1:
        warnings.warn(
            f"Multiple possible guesses for {self.filetype} found. Using the first one: {guesses[0]}"
        )

    return guesses[0]
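
The filtering step for filpdos keeps a base name only when some file beginning with that base name plus `.pdos_` exists. Here is a small self-contained sketch of that rule on a temporary directory. The file names are made up for illustration, and the code mirrors only the list comprehension in `guess()`, not the whole class.

```python
import os
import tempfile
from glob import glob

with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "dos"))
    # Only a suffixed file exists on disk; the bare base name does not.
    open(os.path.join(tmp, "dos", "Si.dos.pdos_tot"), "w").close()

    guesses = [os.path.join(tmp, "Si.dos"), os.path.join(tmp, "dos", "Si.dos")]
    # Keep a base name if any "<base>.pdos_*" file matches.
    kept = [g for g in guesses if glob(f"{g}.pdos_*")]
    kept_relative = [os.path.relpath(g, tmp) for g in kept]

print(kept_relative)
```

Only the base name under `dos/` survives, even though no file with exactly that name exists. This matches the documented behaviour of returning the base name rather than a concrete file.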

IbravUntestedWarning

Bases: UserWarning

Warning for untested ibrav values in ibrav_to_lattice and other related functions.
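
Because this is a dedicated warning category, it can be silenced on its own without hiding other warnings. The sketch below uses a local stand-in class and a hypothetical `noisy()` function so that it runs without the package installed; with the real package, the class would be imported from the module instead.

```python
import warnings


class IbravUntestedWarning(UserWarning):
    """Local stand-in for the class documented above."""


def noisy():
    # Hypothetical function that emits the category, as ibrav_to_lattice does.
    warnings.warn("ibrav != 0 has not been thoroughly tested.", IbravUntestedWarning)
    return "lattice"


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Ignore only this category; other warnings are still recorded.
    warnings.simplefilter("ignore", category=IbravUntestedWarning)
    result = noisy()
    warnings.warn("something else", UserWarning)

print([str(w.message) for w in caught])
```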

parse_pwvals(val)

Helper function to recursively parse values in the PWscf XML files. Supports array/list, dict, bool, float, and int.

Returns original string (or list of substrings) if no match is found.

Source code in pymatgen/io/espresso/utils.py
def parse_pwvals(
    val: str | dict | list | np.ndarray | None,
) -> str | dict[Any, Any] | list[Any] | np.ndarray | bool | float | int | None:
    """
    Helper function to recursively parse values in the PWscf XML files. Supports array/list, dict, bool, float, and int.

    Returns original string (or list of substrings) if no match is found.
    """
    # regex to match floats but not integers, including scientific notation
    float_regex = r"[+-]?(?=\d*[.eE])(?=\.?\d)\d*\.?\d*(?:[eE][+-]?\d+)?"
    # regex to match just integers (signed or unsigned)
    int_regex = r"^(\+|-)?\d+$"
    if isinstance(val, dict):
        val = {k: parse_pwvals(v) for k, v in val.items()}
    elif isinstance(val, list):
        val = [parse_pwvals(x) for x in val]
    elif isinstance(val, np.ndarray):
        val = [parse_pwvals(x) for x in val]
        # Don't return as array unless all elements are same type
        if all(isinstance(x, type(val[0])) for x in val):
            val = np.array(val)
    elif val is None:
        val = None
    elif not isinstance(val, str):
        return val
    elif " " in val:
        val = [parse_pwvals(x) for x in val.split()]
    elif val.lower() in ("true", ".true."):
        val = True
    elif val.lower() in ("false", ".false."):
        val = False
    elif re.fullmatch(float_regex, val):
        val = float(val)
    elif re.fullmatch(int_regex, val):
        val = int(val)
    return val
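
To see how the two regular expressions split scalar tokens, the following sketch copies them and reproduces only the string branches for a token without spaces. `parse_scalar` is a hypothetical name used for illustration; it is not part of the module.

```python
import re

# Copied from the listing above.
FLOAT_REGEX = r"[+-]?(?=\d*[.eE])(?=\.?\d)\d*\.?\d*(?:[eE][+-]?\d+)?"
INT_REGEX = r"^(\+|-)?\d+$"


def parse_scalar(token):
    """Mirror the scalar string branches of parse_pwvals (illustration only)."""
    if token.lower() in ("true", ".true."):
        return True
    if token.lower() in ("false", ".false."):
        return False
    if re.fullmatch(FLOAT_REGEX, token):
        return float(token)
    if re.fullmatch(INT_REGEX, token):
        return int(token)
    return token  # no match: the original string is returned unchanged


for token in [".TRUE.", "42", "-7", "1.0", "1e-3", "-.5", "pbe", "1d-3"]:
    value = parse_scalar(token)
    print(f"{token!r:10} -> {value!r} ({type(value).__name__})")
```

Note that a Fortran-style exponent such as `1d-3` matches neither pattern in this sketch, so it comes back as a string.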

ibrav_to_lattice(ibrav, celldm)

Convert ibrav and celldm to lattice parameters. Essentially a reimplementation of latgen.f90. See that module and the pw.x input documentation for more details.

| PARAMETER | DESCRIPTION |
|-----------|-------------|
| ibrav | The ibrav value (see pw.x input documentation). TYPE: int |
| celldm | The celldm values (see pw.x input documentation). TYPE: list |

| RETURNS | DESCRIPTION |
|---------|-------------|
| Lattice | The lattice corresponding to the ibrav and celldm values. |
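
One detail that is easy to trip over: the pw.x documentation numbers celldm(1) through celldm(6) from 1, while the Python list is indexed from 0, and celldm(2) and celldm(3) are the ratios b/a and c/a rather than lengths. A minimal sketch for a hexagonal cell (ibrav = 4) follows. The numbers are made up, and the six-element list is an assumption about the expected input shape.

```python
import numpy as np

# celldm(1)..celldm(6) in the pw.x docs map to celldm[0]..celldm[5] here.
celldm = [5.0, 0.0, 1.6, 0.0, 0.0, 0.0]  # invented hexagonal cell

a = celldm[0]
c = celldm[2] * a  # celldm(3) is the ratio c/a, not c itself

a1 = np.array([a, 0.0, 0.0])
a2 = np.array([-a / 2, a * np.sqrt(3) / 2, 0.0])
a3 = np.array([0.0, 0.0, c])

# The in-plane vectors have equal length and enclose 120 degrees.
cos_gamma = a1 @ a2 / (np.linalg.norm(a1) * np.linalg.norm(a2))
print(np.linalg.norm(a2), c, cos_gamma)
```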

Source code in pymatgen/io/espresso/utils.py
def ibrav_to_lattice(ibrav, celldm):
    """
    Convert ibrav and celldm to lattice parameters.
    Essentially a reimplementation of latgen.f90
    See that module and the PW.x input documentation for more details.

    Args:
        ibrav (int): The ibrav value (see pw.x input documentation).
        celldm (list): The celldm values (see pw.x input documentation).

    Returns:
        Lattice: The lattice corresponding to the ibrav and celldm values.
    """
    warnings.warn(
        "ibrav != 0 has not been thoroughly tested. Please be careful.",
        IbravUntestedWarning,
    )
    _validate_celldm(ibrav, celldm)
    a = celldm[0]
    if ibrav == 0:
        raise ValueError("ibrav = 0 requires explicit lattice vectors.")
    elif ibrav == 1:
        # cubic P (sc)
        a1 = [a, 0, 0]
        a2 = [0, a, 0]
        a3 = [0, 0, a]
    elif ibrav == 2:
        # cubic F (fcc)
        a1 = [-a / 2, 0, a / 2]
        a2 = [0, a / 2, a / 2]
        a3 = [-a / 2, a / 2, 0]
    elif ibrav == 3:
        # cubic I (bcc)
        a1 = [a / 2, a / 2, a / 2]
        a2 = [-a / 2, a / 2, a / 2]
        a3 = [-a / 2, -a / 2, a / 2]
    elif ibrav == -3:
        # cubic I (bcc), more symmetric axis:
        a1 = [-a / 2, a / 2, a / 2]
        a2 = [a / 2, -a / 2, a / 2]
        a3 = [a / 2, a / 2, -a / 2]
    elif ibrav == 4:
        # Hexagonal and Trigonal P
        c = celldm[2] * a
        a1 = [a, 0, 0]
        a2 = [-a / 2, a * np.sqrt(3) / 2, 0]
        a3 = [0, 0, c]
    elif ibrav == 5:
        # Trigonal R, 3-fold axis c
        # The crystallographic vectors form a three-fold star around
        # the z-axis, the primitive cell is a simple rhombohedron.
        cos_g = celldm[3]  # cos(gamma)
        tx = np.sqrt((1 - cos_g) / 2)
        ty = np.sqrt((1 - cos_g) / 6)
        tz = np.sqrt((1 + 2 * cos_g) / 3)
        a1 = [a * tx, -a * ty, a * tz]
        a2 = [0, 2 * a * ty, a * tz]
        a3 = [-a * tx, -a * ty, a * tz]
    elif ibrav == -5:
        # Trigonal R, 3-fold axis (111);
        # The crystallographic vectors form a three-fold star around (111)
        a_p = a / np.sqrt(3)  # a'
        cos_g = celldm[3]  # cos(gamma)
        tx = np.sqrt((1 - cos_g) / 2)
        ty = np.sqrt((1 - cos_g) / 6)
        tz = np.sqrt((1 + 2 * cos_g) / 3)
        u = tz - 2 * np.sqrt(2) * ty
        v = tz + np.sqrt(2) * ty
        a1 = [a_p * u, a_p * v, a_p * v]
        a2 = [a_p * v, a_p * u, a_p * v]
        a3 = [a_p * v, a_p * v, a_p * u]
    elif ibrav == 6:
        # Tetragonal P (st)
        c = celldm[2] * a
        a1 = [a, 0, 0]
        a2 = [0, a, 0]
        a3 = [0, 0, c]
    elif ibrav == 7:
        # Tetragonal I (bct)
        c = celldm[2] * a
        # pw.x: v1 = (a/2)(1,-1,c/a), so the z component is c/2
        a1 = [a / 2, -a / 2, c / 2]
        a2 = [a / 2, a / 2, c / 2]
        a3 = [-a / 2, -a / 2, c / 2]
    elif ibrav == 8:
        # Orthorhombic P
        b = celldm[1] * a
        c = celldm[2] * a
        a1 = [a, 0, 0]
        a2 = [0, b, 0]
        a3 = [0, 0, c]
    elif ibrav == 9:
        # Orthorhombic base-centered(bco)
        b = celldm[1] * a
        c = celldm[2] * a
        a1 = [a / 2, b / 2, 0]
        a2 = [-a / 2, b / 2, 0]
        a3 = [0, 0, c]
    elif ibrav == -9:
        # Same as 9, alternate description
        b = celldm[1] * a
        c = celldm[2] * a
        a1 = [a / 2, -b / 2, 0]
        a2 = [a / 2, b / 2, 0]
        a3 = [0, 0, c]
    elif ibrav == 91:
        # Orthorhombic one-face base-centered A-type
        b = celldm[1] * a
        c = celldm[2] * a
        a1 = [a, 0, 0]
        a2 = [0, b / 2, -c / 2]
        a3 = [0, b / 2, c / 2]
    elif ibrav == 10:
        # Orthorhombic face-centered
        b = celldm[1] * a
        c = celldm[2] * a
        a1 = [a / 2, 0, c / 2]
        a2 = [a / 2, b / 2, 0]
        a3 = [0, b / 2, c / 2]
    elif ibrav == 11:
        # Orthorhombic body-centered
        b = celldm[1] * a
        c = celldm[2] * a
        a1 = [a / 2, b / 2, c / 2]
        a2 = [-a / 2, b / 2, c / 2]
        a3 = [-a / 2, -b / 2, c / 2]
    elif ibrav == 12:
        # Monoclinic P, unique axis c
        b = celldm[1] * a
        c = celldm[2] * a
        cos_g = celldm[3]  # cos(gamma)
        sin_g = math.sqrt(1 - cos_g**2)
        a1 = [a, 0, 0]
        a2 = [b * cos_g, b * sin_g, 0]
        a3 = [0, 0, c]
    elif ibrav == -12:
        # Monoclinic P, unique axis b
        b = celldm[1] * a
        c = celldm[2] * a
        cos_b = celldm[4]  # cos(beta)
        sin_b = math.sqrt(1 - cos_b**2)  # sin(beta)
        a1 = [a, 0, 0]
        a2 = [0, b, 0]
        a3 = [c * cos_b, 0, c * sin_b]
    elif ibrav == 13:
        # Monoclinic base-centered (unique axis c)
        b = celldm[1] * a
        c = celldm[2] * a
        cos_g = celldm[3]  # cos(gamma)
        sin_g = math.sqrt(1 - cos_g**2)  # sin(gamma)
        a1 = [a / 2, 0, -c / 2]
        a2 = [b * cos_g, b * sin_g, 0]
        a3 = [a / 2, 0, c / 2]
    elif ibrav == -13:
        warnings.warn(
            "ibrav=-13 has a different definition in QE < v.6.4.1.\n"
            "Please check the documentation. The new definition in QE >= v.6.4.1 is "
            "used by pymatgen.io.espresso.\nThey are related by a1_old = -a2_new, "
            "a2_old = a1_new, a3_old = a3_new."
        )
        b = celldm[1] * a
        c = celldm[2] * a
        cos_b = celldm[4]  # cos(beta)
        sin_b = math.sqrt(1 - cos_b**2)  # sin(beta)
        a1 = [a / 2, b / 2, 0]
        a2 = [-a / 2, b / 2, 0]
        a3 = [c * cos_b, 0, c * sin_b]
    elif ibrav == 14:
        # Triclinic
        b = celldm[1] * a
        c = celldm[2] * a
        # For ibrav=14, pw.x defines celldm(4)=cos(bc), celldm(5)=cos(ac), celldm(6)=cos(ab)
        cos_a = celldm[3]  # cos(alpha)
        cos_b = celldm[4]  # cos(beta)
        cos_g = celldm[5]  # cos(gamma)
        sin_g = math.sqrt(1 - cos_g**2)  # sin(gamma)
        vol = np.sqrt(1 + 2 * cos_a * cos_b * cos_g - cos_a**2 - cos_b**2 - cos_g**2)

        a1 = [a, 0, 0]
        a2 = [b * cos_g, b * sin_g, 0]
        a3 = [c * cos_b, c * (cos_a - cos_b * cos_g) / sin_g, c * vol / sin_g]
    else:
        raise ValueError(f"Unknown ibrav: {ibrav}.")

    lattice_matrix = np.array([a1, a2, a3])
    return Lattice(lattice_matrix)
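
A quick way to sanity-check a set of primitive vectors is to compare the cell volume with the known fraction of the conventional cube: a³/4 for fcc and a³/2 for bcc. The sketch below copies the ibrav = 2 and ibrav = 3 vectors from the listing and uses plain NumPy, so it does not depend on `Lattice`. The value of `a` is arbitrary.

```python
import numpy as np

a = 2.0  # arbitrary lattice parameter

# Rows are a1, a2, a3 as in the ibrav == 2 and ibrav == 3 branches.
fcc = np.array([[-a / 2, 0, a / 2], [0, a / 2, a / 2], [-a / 2, a / 2, 0]])
bcc = np.array([[a / 2, a / 2, a / 2], [-a / 2, a / 2, a / 2], [-a / 2, -a / 2, a / 2]])

# The absolute determinant of the row matrix is the cell volume.
fcc_volume = abs(np.linalg.det(fcc))
bcc_volume = abs(np.linalg.det(bcc))
print(fcc_volume, a**3 / 4)
print(bcc_volume, a**3 / 2)
```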

projwfc_orbital_to_vasp(l, m)

Given l quantum number and "m" orbital index in projwfc output, convert to the orbital index in VASP (PROCAR).

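| orbital | QE (l/m) | VASP |
|---------|----------|------|
| s       | 0/1      |  0   |
| pz      | 1/1      |  2   |
| px      | 1/2      |  3   |
| py      | 1/3      |  1   |
| dz2     | 2/1      |  6   |
| dxz     | 2/2      |  7   |
| dyz     | 2/3      |  5   |
| dx2-y2  | 2/4      |  8   |
| dxy     | 2/5      |  4   |
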
Source code in pymatgen/io/espresso/utils.py
def projwfc_orbital_to_vasp(l: int, m: int):  # noqa: E741
    """
    Given l quantum number and "m" orbital index in projwfc output,
    convert to the orbital index in VASP (PROCAR).

    | orbital | QE (l/m) | VASP |
    |---------|----------|------|
    | s       | 0/1      |  0   |
    | pz      | 1/1      |  2   |
    | px      | 1/2      |  3   |
    | py      | 1/3      |  1   |
    | dz2     | 2/1      |  6   |
    | dxz     | 2/2      |  7   |
    | dyz     | 2/3      |  5   |
    | dx2-y2  | 2/4      |  8   |
    | dxy     | 2/5      |  4   |

    """
    if l < 0 or l > 2:
        raise ValueError(f"l must be 0, 1, or 2. Got {l}.")
    if m < 1 or m > 2 * l + 1:
        raise ValueError(f"m must be between 1 and 2*l+1. Got {m}.")
    l_map = [[0], [2, 3, 1], [6, 7, 5, 8, 4]]
    return l_map[l][m - 1]
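
As an illustration of how the mapping might be used, the sketch below relabels a set of d-channel projections from projwfc order into VASP orbital names. The lookup table is copied from the function. The `VASP_ORBITALS` label list and the projection values are assumptions made for this example.

```python
# Same lookup table as in the function: L_MAP[l][m - 1] -> VASP index.
L_MAP = [[0], [2, 3, 1], [6, 7, 5, 8, 4]]
# PROCAR column order, listed here as an assumption for labelling.
VASP_ORBITALS = ["s", "py", "pz", "px", "dxy", "dyz", "dz2", "dxz", "dx2-y2"]

# Invented d-channel projections in projwfc order (m = 1..5).
qe_d_projections = [0.10, 0.20, 0.30, 0.40, 0.50]

vasp_ordered = {}
for m, weight in enumerate(qe_d_projections, start=1):
    vasp_index = L_MAP[2][m - 1]
    vasp_ordered[VASP_ORBITALS[vasp_index]] = weight

print(vasp_ordered)
```

Taken together, the three sub-lists cover the indices 0 through 8 exactly once, so the mapping is a permutation of the s, p, and d columns.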