Genealogica Grafica: errors related to gedcom inconsistencies

 13 Nov 2010

Gedcom structure

A Gedcom file records the family structure in a standard format. This standard is maintained by The Church of Jesus Christ of Latter-day Saints (LDS) (see here for details).

Apart from the header and trailer, the file consists of records for individuals (INDI records) and for relationships (FAM records). Each record has a unique identifier, written between @ signs in the file. For INDI records this is usually a number preceded by I (capital i); FAM records normally get a number preceded by F. These identifiers are only meaningful within the gedcom. They are needed to link INDI records to FAM records and vice versa.

INDI records contain details of individuals, such as their name, place and date of birth and death, and occupation. There can also be references to FAM records, for instance to indicate the family from which the person was born. For such a reference, the FAM identifier is used.

FAM records contain the details of a relationship (typically a marriage), such as place and dates, and of course the individuals who play a role in it. There can be a HUSBand, a WIFE and a number of CHILdren. These are all referenced by their own ID.

With this way of relating INDI's to FAM's and vice versa, it is feasible to record birth from one family and adoption from another, dual marriages, unknown parents (HUSB or WIFE missing in the FAM record), etc.

In the example below, John Smith married Mary Green and they had a son, Jacob Smith.

0 @I001@ INDI
1 NAME John Smith
1 BIRT
2 DATE ABT 1870
2 PLAC London
1 FAMS @F01@

0 @I002@ INDI
1 NAME Mary Green
1 FAMS @F01@

0 @I003@ INDI
1 NAME Jacob Smith
1 FAMC @F01@

0 @F01@ FAM
1 HUSB @I001@
1 WIFE @I002@
1 CHIL @I003@

The FAMS references for John and Mary indicate that they are partners in FAM F01. For Jacob, the FAMC reference indicates that he is a child in F01.

NB: a real gedcom contains no blank lines. The number at the start of each line indicates its level in the data structure: a higher number means the line is a detail of the preceding lower-numbered line (e.g. the birth data of John).
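
To make the line structure concrete, here is a minimal Python sketch that splits gedcom text into records keyed by identifier. It is an illustration only and not how Genealogica Grafica reads files; the function name parse_gedcom and the shape of the returned dictionary are our own assumptions.

```python
import re

# One gedcom line: level, optional @ID@ identifier, tag, optional value.
LINE_RE = re.compile(r"^\s*(\d+)\s+(?:(@[^@]+@)\s+)?(\w+)(?:\s(.*))?$")


def parse_gedcom(text):
    """Return {identifier: {"type": tag, "lines": [(level, tag, value), ...]}}.

    Hypothetical helper: only records that carry an identifier are kept,
    so the header and trailer are skipped.
    """
    records = {}
    current = None
    for raw in text.splitlines():
        if not raw.strip():
            continue  # tolerate blank lines, as used in the printed example
        match = LINE_RE.match(raw)
        if match is None:
            raise ValueError(f"not a gedcom line: {raw!r}")
        level = int(match.group(1))
        xref, tag = match.group(2), match.group(3)
        value = (match.group(4) or "").strip()
        if level == 0:
            # A level-0 line starts a new record.
            current = {"type": tag, "lines": []}
            if xref:
                records[xref] = current
        elif current is not None:
            current["lines"].append((level, tag, value))
    return records


sample = """\
0 @I003@ INDI
1 NAME Jacob Smith
1 FAMC @F01@
0 @F01@ FAM
1 CHIL @I003@
"""
records = parse_gedcom(sample)
print(records["@I003@"]["lines"])
# [(1, 'NAME', 'Jacob Smith'), (1, 'FAMC', '@F01@')]
```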

Possible gedcom inconsistencies

The references between INDI and FAM records can lead to inconsistencies. For instance: what to do if John Smith, with SEX = Male, shows up in FAM F01 as WIFE?

Such an error does not occur very often, but the following do:

  • The INDI record specifies that the person relates as a partner (FAMS) or a child (FAMC) to a particular FAM, but that FAM is itself not in the gedcom.
  • A FAM record lists INDI's as HUSB, WIFE or CHIL, but the referenced individual is not in the file, or if it is, it does not itself confirm that it relates to the FAM (as FAMS or FAMC).

Genealogica Grafica checks for such inconsistencies, since they may frustrate its calculations. The corresponding errors are explained below; how to cope with them is described further down.
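
The two link checks listed above can be sketched in Python as follows. This is a simplified illustration under our own assumptions (records already loaded into plain dictionaries, identifiers without @ signs, a hypothetical function name check_links), not the actual Genealogica Grafica code.

```python
def check_links(indis, fams):
    """Report broken links between INDI and FAM records.

    indis: {id: {"FAMS": [fam ids], "FAMC": [fam ids]}}
    fams:  {id: {"HUSB": id or None, "WIFE": id or None, "CHIL": [ids]}}
    Returns a list of error strings (hypothetical wording).
    """
    errors = []

    # Check 1: every FAMS/FAMC reference must point to an existing FAM.
    for pid, person in indis.items():
        for tag in ("FAMS", "FAMC"):
            for fid in person.get(tag, []):
                if fid not in fams:
                    errors.append(f"{pid}: {tag} {fid} refers to a missing FAM")

    # Check 2: every HUSB/WIFE/CHIL must exist and confirm the link.
    for fid, fam in fams.items():
        refs = [("HUSB", fam.get("HUSB")), ("WIFE", fam.get("WIFE"))]
        refs += [("CHIL", child) for child in fam.get("CHIL", [])]
        for role, pid in refs:
            if pid is None:
                continue  # unknown parent: allowed
            if pid not in indis:
                errors.append(f"{fid}: {role} {pid} refers to a missing INDI")
                continue
            back = "FAMC" if role == "CHIL" else "FAMS"
            if fid not in indis[pid].get(back, []):
                errors.append(f"{fid}: {role} {pid} has no matching {back}")
    return errors


indis = {"I001": {"FAMS": ["F01"]}, "I003": {"FAMC": ["F02"]}}
fams = {"F01": {"HUSB": "I001", "WIFE": "I002", "CHIL": ["I003"]}}
for line in check_links(indis, fams):
    print(line)
```

In this made-up data, Jacob (I003) points to a family F02 that does not exist, Mary (I002) is missing from the file, and Jacob does not confirm that he is a child in F01, so three errors are reported.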

These errors originate in the genealogical program that produced the gedcom. Most likely they are the result of changes you made over time, such as deleting a person. Clearly, the program should also delete that person from any marriage he/she was involved in. But sometimes this goes wrong.

Another source of error that we've seen is corruption of the gedcom itself. If, for instance, an end-of-line character is missing, the next line is not read as a line of its own. If that line is the start of an INDI or FAM record, that person or family is lost and its data is seen as part of the current record. Such errors may result from improper handling of special characters by your program and from (data) transmission errors. You can only find them by analysis of the gedcom file.
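
As a rough aid for such an analysis, the Python sketch below flags lines that look damaged. It is a heuristic of our own and not a Genealogica Grafica feature: it reports lines that do not have the level-tag shape, levels that jump by more than one, and lines that seem to contain the start of another INDI or FAM record. The name suspicious_lines and the messages are assumptions.

```python
import re

# Shape of a normal line: level, optional @ID@, tag, optional value.
LINE_RE = re.compile(r"^(\d+) (?:@[^@ ]+@ )?[A-Za-z0-9_]+(?: .*)?$")
# Start of an INDI or FAM record, e.g. "0 @I002@ INDI".
RECORD_START_RE = re.compile(r"\d @[^@ ]+@ (?:INDI|FAM)\b")


def suspicious_lines(text):
    """Return (line number, reason) pairs for lines that look damaged."""
    findings = []
    previous_level = -1
    for number, line in enumerate(text.splitlines(), start=1):
        match = LINE_RE.match(line)
        if match is None:
            findings.append((number, "does not look like a gedcom line"))
            continue
        level = int(match.group(1))
        if level > previous_level + 1:
            findings.append((number, "level jumps by more than one"))
        # Search from position 1 so the line's own record start is ignored.
        if RECORD_START_RE.search(line, 1):
            findings.append((number, "contains the start of another record"))
        previous_level = level
    return findings


damaged = (
    "0 @I001@ INDI\n"
    "1 NAME John Smith0 @I002@ INDI\n"  # end-of-line lost here
    "1 NAME Mary Green\n"
    "3 PLAC London\n"
)
print(suspicious_lines(damaged))
```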

There is another kind of inconsistency, namely that someone is his or her own descendant (typically the result of selecting an ancestor by mistake when entering a child). This is a so-called 'loop'. When you trace ancestors or descendants from this person, you find a path that continues indefinitely, since you encounter the same person again and then move on. Loops are notoriously hard to detect, but Genealogica Grafica will do it for you. It will also pinpoint which links in the loop are suspect.
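
For readers who want to see what a loop looks like in data terms, here is a minimal Python sketch using a standard depth-first search over child-to-parent links. It only finds one loop and does not judge which link is suspect; the function name find_loop and the input layout are our own assumptions, not Genealogica Grafica's method.

```python
def find_loop(parents):
    """Return a list of ids forming a loop (first id repeated at the end),
    or None if nobody is their own ancestor.

    parents: {person id: [parent ids]}
    """
    IN_PROGRESS, DONE = 1, 2
    state = {}

    def visit(pid, path):
        state[pid] = IN_PROGRESS
        path.append(pid)
        for parent in parents.get(pid, []):
            if state.get(parent) == IN_PROGRESS:
                # We climbed back to someone already on the current path.
                return path[path.index(parent):] + [parent]
            if parent not in state:
                loop = visit(parent, path)
                if loop:
                    return loop
        path.pop()
        state[pid] = DONE
        return None

    for pid in list(parents):
        if pid not in state:
            loop = visit(pid, [])
            if loop:
                return loop
    return None


# I005 was entered as a parent of I001, but is also a child of I003.
print(find_loop({"I003": ["I001"], "I001": ["I005"], "I005": ["I003"]}))
# ['I003', 'I001', 'I005', 'I003']
```

The recursion depth equals the number of generations climbed, which stays small for real family trees.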

Also, there may be duplicate entries in your gedcom, a situation to be avoided at all times. To find these, we note that two records of the same person cannot be linked via family relationships. Hence, if duplicates are present, the gedcom must contain isolated groups of persons. Such a group may form a small family, but none of its members connects to the main body of the gedcom. Genealogica Grafica will list these groups and then scan them for duplication of individuals.
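
Finding isolated groups amounts to finding the connected components of the family graph. The Python sketch below shows one way to do that; it is an illustration under our own assumptions (the function name isolated_groups, families given as plain lists of member ids) and it leaves out the subsequent scan for duplicates.

```python
from collections import defaultdict


def isolated_groups(person_ids, families):
    """Return the groups of persons connected via families, largest first.

    families: list of lists with the ids of all members of one FAM
    (HUSB, WIFE and CHIL together).
    """
    neighbours = defaultdict(set)
    for members in families:
        for pid in members:
            neighbours[pid].update(m for m in members if m != pid)

    seen, groups = set(), []
    for start in person_ids:
        if start in seen:
            continue
        group, todo = set(), [start]
        while todo:
            pid = todo.pop()
            if pid in seen:
                continue
            seen.add(pid)
            group.add(pid)
            todo.extend(neighbours[pid] - seen)
        groups.append(group)
    return sorted(groups, key=len, reverse=True)


persons = ["I001", "I002", "I003", "I010", "I011"]
families = [["I001", "I002", "I003"], ["I010", "I011"]]
main, *isolated = isolated_groups(persons, families)
print(isolated)  # one isolated group: I010 and I011
```

Everything after the first (largest) group is a candidate for closer inspection.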

Date inconsistencies

We check various date relationships in order to assist you in locating link errors and data entry errors:

  • the order of birth, baptism, marriage, separation, death and burial
  • the age of partners when they marry and separate
  • the age of mother and father at birth of their children
  • the period between births of the same mother: 9 months minimum unless stillborn, and a warning when births are more than 25 years apart (a re-marriage of the father may be missing)
  • irregularities in the baptising of children (e.g. not in the same order as birth).

Some of these checks are straightforward. Checking that death is after birth, for instance, can be done on individual INDI records. Other tests are more involved. To verify that a mother was older than (say) 15 before she had her first child, you need to correlate the records for the mother, her FAM relationships and all the children.
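
Once the mother and her children have been correlated, the checks themselves are simple. The Python sketch below shows the mother's age and the birth spacing checks. The thresholds are written as constants; 270 days for '9 months' and the skipping of gaps of at most one day (assumed to be twins) are our own simplifications, and the stillborn exception is not modelled.

```python
from datetime import date

MIN_MOTHER_AGE = 15           # years, the "(say) 15" from the text
MIN_SPACING_DAYS = 270        # about 9 months (assumed approximation)
MAX_SPACING_DAYS = 25 * 365   # about 25 years


def age_on(born, day):
    """Whole years between a birth date and a later date."""
    return day.year - born.year - ((day.month, day.day) < (born.month, born.day))


def check_births(mother_birth, child_births):
    """Return warnings about a mother's age and the spacing of births."""
    warnings = []
    births = sorted(child_births)
    for birth in births:
        age = age_on(mother_birth, birth)
        if age < MIN_MOTHER_AGE:
            warnings.append(f"mother was only {age} at the birth on {birth.isoformat()}")
    for earlier, later in zip(births, births[1:]):
        gap = (later - earlier).days
        if 1 < gap < MIN_SPACING_DAYS:  # gaps of 0 or 1 day: assumed twins
            warnings.append(f"only {gap} days between births on "
                            f"{earlier.isoformat()} and {later.isoformat()}")
        elif gap > MAX_SPACING_DAYS:
            warnings.append(f"more than 25 years between births on "
                            f"{earlier.isoformat()} and {later.isoformat()}")
    return warnings


for warning in check_births(date(1850, 5, 1),
                            [date(1872, 3, 1), date(1872, 9, 1), date(1900, 1, 1)]):
    print(warning)
```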

Genealogica Grafica does not stop there. It will 'reason' about all the available dates in relation to each other. This algorithm, akin to Artificial Intelligence, can indicate inconsistencies across generations, in particular when dates have not been recorded. Consider for instance the case where the birth dates of mother and grand-child are 20 years apart, while the birth date of the daughter is not known. Genealogica Grafica will draw your attention to this situation, since there is most likely a recording error.
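
The actual reasoning algorithm is not described here. As a much simpler stand-in, the Python sketch below propagates lower and upper bounds on birth years along parent-child links, assuming a minimum parent age of 15, and reports persons whose bounds contradict each other. The names and the threshold are our own assumptions.

```python
MIN_PARENT_AGE = 15  # assumed minimum age of a parent at a child's birth


def conflicting_birth_bounds(birth_year, parent_child):
    """Return {person: (earliest, latest)} where earliest > latest.

    birth_year:   {person: year, or None when not recorded}
    parent_child: list of (parent, child) pairs
    """
    lo = {p: (y if y is not None else float("-inf")) for p, y in birth_year.items()}
    hi = {p: (y if y is not None else float("inf")) for p, y in birth_year.items()}

    # Repeat until nothing changes; the pass limit guards against loops.
    for _ in range(len(birth_year)):
        changed = False
        for parent, child in parent_child:
            if lo[parent] + MIN_PARENT_AGE > lo[child]:
                lo[child] = lo[parent] + MIN_PARENT_AGE  # child cannot be older
                changed = True
            if hi[child] - MIN_PARENT_AGE < hi[parent]:
                hi[parent] = hi[child] - MIN_PARENT_AGE  # parent cannot be younger
                changed = True
        if not changed:
            break
    return {p: (lo[p], hi[p]) for p in birth_year if lo[p] > hi[p]}


years = {"mother": 1800, "daughter": None, "grandchild": 1820}
links = [("mother", "daughter"), ("daughter", "grandchild")]
print(conflicting_birth_bounds(years, links)["daughter"])
# (1815, 1805)
```

In the example from the text, the daughter would have to be born in 1815 or later (her mother being at least 15) and in 1805 or earlier (being at least 15 herself when the grand-child was born), which is impossible, so the chain is flagged.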

What to do with errors

You will need to correct gedcom consistency errors. They can corrupt the Genealogica Grafica outputs. Beware that in many cases they also compromise the results of your own genealogical program. Genealogica Grafica will remove inconsistent links for you, but this may result in the loss of an essential family relationship.

Some of the errors may be hard to locate. Since each genealogical program has its own user interface, there is no standard approach. Typically, you would find the person involved, check the family links and then add or remove marriage or birth data.

To assist you in this correction process, a file is produced with details of the errors reported (see example). This file is named 'GedcomErrors.HTM' and can be found in the directory from which your gedcom is read or in which you save your settings.

From within Genealogica Grafica you can inspect this file via 'Windows - Gedcom Errors'. The links then bring you to the actual records in the gedcom, so that you can check at the source. The 'Nearby Family' display and various search options will ease the process of pinpointing what's wrong. Some corrections can be made in Genealogica Grafica via the edit menu in the gedcom view window. You can change dates, places, names, notes, etc., but you cannot change IDs, links or the order of records. For that you should return to your registration program (which can be open at the same time).