Pymatgen Tutorial¶

In this tutorial, we will quickly learn Pymatgen (Python Materials Genomics), a robust, open-source Python library for materials analysis. It is very similar to ASE, but it is better written and maintained, and it has more functions and tools for the VASP code. Here we focus on applying transformations for our own needs.

Installation¶

Activate your virtual environment and then install Pymatgen through pip.

source mace/bin/activate
pip install pymatgen

You also need to set the POTCAR directory if you want to use Pymatgen to generate VASP input files. Please see the Pymatgen documentation on POTCAR setup for details.

Transformations¶

One of the reasons we use Pymatgen is that it has lots of useful transformations, so we don't need to reinvent the wheel. You can find more details in the API documentation. The following transformations are covered in this tutorial:

SupercellTransformation: This transformation replicates a unit cell to a supercell.
OxidationStateDecorationTransformation: This transformation decorates a structure with oxidation states.
SubstitutionTransformation: This transformation substitutes species for one another.
OrderDisorderedStructureTransformation: This transformation orders a disordered structure. The disordered structure must be decorated with oxidation states so that the Ewald sum can be computed. No attempt is made to perform symmetry determination to reduce the number of combinations.

Transmuter¶

We need to create a transmuter to apply transformations. Below we use a CifTransmuter to read the structure and then use SubstitutionTransformation to replace 30% of the Al with Fe. If your structure is in a different form, you can use PoscarTransmuter (reads POSCAR or CONTCAR files from VASP) or StandardTransmuter (takes Pymatgen Structure objects) for the same purpose.

You can get the CIF file of Al(HCOO)₃ here: https://files.matsci.dev/Al_empty.cif

from pymatgen.alchemy.transmuters import CifTransmuter
from pymatgen.transformations.standard_transformations import SubstitutionTransformation

# Read the CIF as-is, without reducing it to the primitive cell
transmuter = CifTransmuter.from_filenames(["Al_empty.cif"], primitive=False)
# Turn every Al site into a partially occupied site: 30% Fe, 70% Al
substitution_dict = {"Al": {"Fe": 0.3, "Al": 0.7}}
transmuter.append_transformation(SubstitutionTransformation(substitution_dict))
# Write the substituted (still disordered) structure to a new CIF
transmuter.transformed_structures[-1].to(filename="Al_substituted.cif")

Ordering¶

We need to order the disordered structure because the simulation program cannot deal with disorder (partially occupied sites) explicitly. Since many orderings are symmetrically equivalent, we can use RemoveDuplicatesFilter to remove the duplicate structures. This filter uses StructureMatcher to compare structures. You can tune the StructureMatcher parameters to control how strictly the algorithm decides whether two structures are the same.

The ordering process might take a long time because it requires evaluating the electrostatic (Ewald) energy of each candidate. Structure matching is also time consuming. In addition, the total number of possible permutations grows factorially with the number of sites. Therefore, you should limit the size of your supercell before you start, and try to estimate the total number of permutations in advance.
You should also make sure that the site occupancy is compatible with the number of sites in the supercell: the occupancy multiplied by the number of substituted sites must be an integer. For example, an occupancy of 0.125 requires the number of sites to be a multiple of 8.
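
A quick way to estimate the cost beforehand is to count the combinations with the binomial coefficient. The helper below is a hypothetical sketch (count_orderings is not part of Pymatgen), and the site counts are illustrative assumptions. It also checks that the occupancy corresponds to a whole number of atoms.

```python
from math import comb, isclose


def count_orderings(n_sites: int, fraction: float) -> int:
    """Count the ways to place the substituent on n_sites at the given fraction.

    This is the number of candidates before symmetry-equivalent
    orderings are removed.
    """
    n_substituted = n_sites * fraction
    # The occupancy must correspond to a whole number of substituted sites
    if not isclose(n_substituted, round(n_substituted), abs_tol=1e-8):
        raise ValueError(
            f"{fraction} x {n_sites} sites = {n_substituted:.3f} is not an integer"
        )
    return comb(n_sites, round(n_substituted))


print(count_orderings(16, 0.125))  # 2 of 16 sites -> 120
print(count_orderings(8, 0.5))     # 4 of 8 sites -> 70
print(count_orderings(32, 0.5))    # 16 of 32 sites -> already hundreds of millions
```

Doubling the cell at a fixed fraction increases the count very quickly, which is why the supercell size should be chosen before running the ordering.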

from pymatgen.alchemy.filters import RemoveDuplicatesFilter
from pymatgen.alchemy.transmuters import CifTransmuter
from pymatgen.analysis.structure_matcher import StructureMatcher
from pymatgen.transformations.standard_transformations import (
    OrderDisorderedStructureTransformation,
    OxidationStateDecorationTransformation,
    SubstitutionTransformation,
)

transmuter = CifTransmuter.from_filenames(["Al_empty.cif"], primitive=False)
# Replace half of the Al with Fe (creates partially occupied sites)
substitution_dict = {"Al": {"Fe": 0.5, "Al": 0.5}}
# Formal oxidation states for the Ewald sum; C is +2 in formate (HCOO-), which keeps the cell neutral
oxi_dict = {"Al": 3, "Fe": 3, "O": -2, "C": 2, "H": 1}
transmuter.append_transformation(SubstitutionTransformation(substitution_dict))
transmuter.append_transformation(OxidationStateDecorationTransformation(oxi_dict))
# extend_collection=500 keeps up to 500 orderings per input structure
transmuter.append_transformation(OrderDisorderedStructureTransformation(), extend_collection=500)
print(f"Total ordering: {len(transmuter)}")
# Drop symmetrically equivalent orderings
transmuter.apply_filter(RemoveDuplicatesFilter(StructureMatcher()))
print(f"Total ordering after removing duplicates: {len(transmuter)}")
# Write each unique ordered structure to its own CIF
for i, structure in enumerate(transmuter.transformed_structures):
    structure.final_structure.to(filename=f"Al_substituted_ordered_{i}.cif")

Export ordered structures and prepare VASP input files¶

In VASP, we usually need the following input files:

INCAR: input parameters
POSCAR: structure (similar to .cif)
POTCAR: pseudopotential
KPOINTS: k-points

You can use transmuter.write_vasp_input() to write VASP input files for all structures in the transmuter. Pymatgen uses a VASP input set (here we use MITRelaxSet) to define inputs for VASP calculations. You just need to pass your structure to the input set, and then you can generate the input. You can also modify the input for your own needs using arguments starting with user_, e.g., user_potcar_functional.

from pymatgen.alchemy.filters import RemoveDuplicatesFilter
from pymatgen.alchemy.transmuters import CifTransmuter
from pymatgen.analysis.structure_matcher import StructureMatcher
from pymatgen.io.vasp.sets import MITRelaxSet
from pymatgen.transformations.standard_transformations import (
    OrderDisorderedStructureTransformation,
    OxidationStateDecorationTransformation,
    SubstitutionTransformation,
    SupercellTransformation,
)

transmuter = CifTransmuter.from_filenames(["Al_empty.cif"], primitive=False)
# Double the cell along a before substituting
transmuter.append_transformation(SupercellTransformation.from_scaling_factors(2, 1, 1))
substitution_dict = {"Al": {"Fe": 0.125, "Al": 0.875}}
# Formal oxidation states for the Ewald sum; C is +2 in formate (HCOO-), which keeps the cell neutral
oxi_dict = {"Al": 3, "Fe": 3, "O": -2, "C": 2, "H": 1}
transmuter.append_transformation(SubstitutionTransformation(substitution_dict))
transmuter.append_transformation(OxidationStateDecorationTransformation(oxi_dict))
transmuter.append_transformation(OrderDisorderedStructureTransformation(), extend_collection=500)
print(f"Total ordering: {len(transmuter)}")
transmuter.apply_filter(RemoveDuplicatesFilter(StructureMatcher()))
print(f"Total ordering after removing duplicates: {len(transmuter)}")

# INCAR overrides: convergence criteria, DFT-D3 dispersion correction (IVDW = 11), up to 1500 ionic steps
incar_dict = {"EDIFF": 1e-5, "EDIFFG": -1e-2, "IVDW": 11, "ISYM": 2, "NSW": 1500}
# Write the VASP input files (and a CIF) for every remaining structure under DFT_calcs/
transmuter.write_vasp_input(
    vasp_input_set=MITRelaxSet,
    user_potcar_functional="PBE_54",
    user_incar_settings=incar_dict,
    output_dir="DFT_calcs",
    include_cif=True,
)

Total ordering: 120
Total ordering after removing duplicates: 11

Create VASP Input¶

Sometimes we just want to generate VASP input files from a single structure file, e.g., a CIF.

You need to set PMG_VASP_PSP_DIR (the POTCAR directory) in your Pymatgen configuration so that the POTCAR files can be found.

We choose the POTCAR for each element based on the Materials Project documentation. You can also find more information in the VASP wiki.

If you don't understand a VASP input parameter, please refer to the VASP manual.

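from pymatgen.core import Structure
from pymatgen.io.vasp.sets import MITRelaxSet

# INCAR overrides applied on top of the MITRelaxSet defaults
incar_dict = {"EDIFFG": -1e-2, "IVDW": 11, "ISYM": 2, "NSW": 1500, "ENCUT": 520}
structure = Structure.from_file("Al_empty.cif")
# 'length' requests an automatic k-point mesh generated from a length parameter of 25
inputset = MITRelaxSet(
    structure=structure,
    user_incar_settings=incar_dict,
    user_kpoints_settings={"length": 25},
)
# Write INCAR, POSCAR, POTCAR and KPOINTS (plus a CIF) to ./DFT_calc
inputset.write_input(output_dir="./DFT_calc", include_cif=True)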

Conversion between ASE Atoms and Pymatgen Structure¶

We use ASE to run machine learning potential computations, so we need to convert a Pymatgen Structure to ASE Atoms using AseAtomsAdaptor. The same adaptor converts Atoms back to a Structure, which is also how you can handle formats such as extxyz that Pymatgen doesn't support.

from pymatgen.io.ase import AseAtomsAdaptor
from pymatgen.core import Structure
from ase.io import read
# convert structure to atoms
structure = Structure.from_file("Al_empty.cif")
ase_atoms = AseAtomsAdaptor.get_atoms(structure) 
print(ase_atoms)

# convert atoms to structure
ase_atoms = read("Al_empty.cif")
structure = AseAtomsAdaptor.get_structure(ase_atoms)
print(structure)