Release 2024.07.01
Notable Changes
Stream Queue Message Growth Fix
Stream ingestion could cause the stream queues to grow rapidly if an ingestion job failed part way through, because the job would then start again from the beginning. To fix this, we now track progress during the initial data load to a stream, so that a job can continue from where it stopped if there is an error part way through.
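The idea behind this fix can be illustrated with a minimal sketch. It is not the CluedIn implementation: the function `ingest_with_checkpoint`, the `checkpoint` dictionary and the simulated failure are all hypothetical names chosen for illustration. It only shows why recording progress after each published record stops a retried job from re-queuing messages it has already sent.

```python
from typing import Callable, Dict, List, Set


def ingest_with_checkpoint(
    records: List[str],
    checkpoint: Dict[str, int],
    publish: Callable[[str], None],
) -> None:
    """Publish records, resuming after the last recorded position (hypothetical helper)."""
    start = checkpoint.get("next_index", 0)
    for index in range(start, len(records)):
        publish(records[index])
        # Record progress only after the publish succeeded.
        checkpoint["next_index"] = index + 1


# Simulate a job that fails part way through and is then retried.
queue: List[str] = []
checkpoint: Dict[str, int] = {}
records = ["a", "b", "c", "d", "e"]
already_failed: Set[str] = set()


def flaky_publish(record: str) -> None:
    # Fail once on record "d" to imitate an ingestion error.
    if record == "d" and record not in already_failed:
        already_failed.add(record)
        raise RuntimeError("simulated ingestion failure")
    queue.append(record)


try:
    ingest_with_checkpoint(records, checkpoint, flaky_publish)
except RuntimeError:
    pass  # a real job would be rescheduled here

# The retry continues at "d" instead of starting again at "a".
ingest_with_checkpoint(records, checkpoint, flaky_publish)
print(queue)  # ['a', 'b', 'c', 'd', 'e']
```

Without the checkpoint, the retry would publish "a", "b" and "c" a second time, which is the queue growth described above.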
Release Notes
CluedIn
Features
- Additional liveness checks to detect RabbitMQ disconnects on processing pods.
RabbitMQ connections should normally recover automatically within 5 minutes. If a connection does not recover, the liveness probe reports a “Red” status, and from that point the Kubernetes liveness thresholds apply.
In effect:
The CluedIn server does not report a failing liveness probe during the first 5 minutes, which gives the connection time to recover by itself.
The Kubernetes liveness probe then takes 10 to 25 minutes to trip, depending on whether the probe itself times out.
In total, it takes 15 to 30 minutes before Kubernetes restarts the pod.
The following settings have been added:
| Configuration key | Default Value |
| --- | --- |
| Health.SystemLivenessChecks.Enabled | true |
| Health.SystemLivenessChecks.InitialDelaySeconds | 600 |
| Health.SystemLivenessChecks.CheckIntervalSeconds | 30 |
| Health.SystemLivenessChecks.ConsecutiveErrorCountThreshold | 10 |
- Make RabbitMQ messaging more robust by using delivery confirmations and forcing all messages to be persisted to disk
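The timing of the liveness checks described above can be worked through in a short sketch. These notes do not say how the numeric settings interact, so the sketch assumes that a check runs every `CheckIntervalSeconds` and that the probe only fails after `ConsecutiveErrorCountThreshold` failed checks in a row. Under that assumption the defaults give 30 s × 10 = 300 s, which matches the 5 minute window mentioned in the notes, but that correspondence is an inference. The class names are hypothetical, and `InitialDelaySeconds` is listed only for completeness and is not modelled.

```python
from dataclasses import dataclass


@dataclass
class LivenessSettings:
    # Defaults copied from the settings table in these notes.
    enabled: bool = True
    initial_delay_seconds: int = 600  # not modelled in this sketch
    check_interval_seconds: int = 30
    consecutive_error_count_threshold: int = 10


class ConsecutiveErrorTracker:
    """Counts failed checks in a row (assumed semantics, not CluedIn code)."""

    def __init__(self, settings: LivenessSettings) -> None:
        self.settings = settings
        self.consecutive_errors = 0

    def record_check(self, connection_ok: bool) -> bool:
        """Record one check result and return True while the pod counts as live."""
        if not self.settings.enabled:
            return True
        if connection_ok:
            self.consecutive_errors = 0  # any success resets the streak
        else:
            self.consecutive_errors += 1
        threshold = self.settings.consecutive_error_count_threshold
        return self.consecutive_errors < threshold


settings = LivenessSettings()
tracker = ConsecutiveErrorTracker(settings)

# Ten failed checks in a row: live for the first nine, not live on the tenth.
results = [tracker.record_check(connection_ok=False) for _ in range(10)]
print(results.count(True), results[-1])  # 9 False

# Server-side grace period implied by the defaults, in minutes (assumed reading).
grace_minutes = (
    settings.check_interval_seconds * settings.consecutive_error_count_threshold
) // 60

# Kubernetes window quoted in the notes, in minutes.
kubernetes_minutes = (10, 25)
total_minutes = tuple(grace_minutes + m for m in kubernetes_minutes)
print(grace_minutes, total_minutes)  # 5 (15, 30)
```

A single successful check resets the counter, so a connection that recovers within the grace period never produces a failing probe.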
Fixes
- Synchronous call paths over asynchronous operations can lead to reduced throughput in the task scheduler
- Strong typing upgrade scenario is targeting the incorrect version
- Part id compaction during entity merging could cause save errors
- Application could occasionally freeze whilst obtaining a lock
- Deduplication might not find results with certain vocabulary typing configurations
- Queries targeting untyped boolean keys can return incorrect values
- Entity edges with the same reference points will merge in the EntityEdgeCollection even if their properties are different
  The old behaviour can be re-enabled with the configuration key Feature.EntityEdgeCollection.LegacyEdgeMergingWithoutConsideringPropertiesEnabled
- Streams can bloat messages in the queues if the ingestion job fails part way through
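The edge merging fix in the list above can be pictured with a small sketch. This is an illustration under assumptions, not the EntityEdgeCollection code: the `Edge` tuple layout, the `edge_type` field, the `merge_edges` function and the choice to keep the first edge seen are all hypothetical. The `legacy_merge_without_properties` parameter stands in for the legacy configuration key.

```python
from typing import Dict, List, Tuple

# Hypothetical edge shape: (from_ref, edge_type, to_ref, properties)
Edge = Tuple[str, str, str, Dict[str, str]]


def merge_edges(
    edges: List[Edge],
    legacy_merge_without_properties: bool = False,
) -> List[Edge]:
    """Collapse duplicate edges, optionally ignoring properties (legacy mode)."""
    merged: Dict[tuple, Edge] = {}
    for from_ref, edge_type, to_ref, properties in edges:
        key: tuple = (from_ref, edge_type, to_ref)
        if not legacy_merge_without_properties:
            # Include the properties in the identity of the edge.
            key += (tuple(sorted(properties.items())),)
        # Keep the first edge seen for each key (an assumption of this sketch).
        merged.setdefault(key, (from_ref, edge_type, to_ref, properties))
    return list(merged.values())


edges: List[Edge] = [
    ("/Person#1", "/WorksFor", "/Organization#9", {"role": "engineer"}),
    ("/Person#1", "/WorksFor", "/Organization#9", {"role": "manager"}),
    ("/Person#1", "/WorksFor", "/Organization#9", {"role": "engineer"}),
]

print(len(merge_edges(edges)))  # 2
print(len(merge_edges(edges, legacy_merge_without_properties=True)))  # 1
```

With properties included in the key, the two edges that differ only in `role` stay separate, while exact duplicates still collapse into one.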
CluedIn.MicroServices
Features
- Make RabbitMQ messaging more robust by forcing all messages to be persisted to disk
Fixes
- Data source can crash when a queue does not exist
- Getting the count from a SQL import could cause errors
- Removing a data set does not remove its schedule jobs
- Unable to remove golden records after removing a processed data set
CluedIn.UI
Features
- Added the ability to cancel the remove records job from a data source
Fixes
- Unable to remove a data set after removing the associated records
Runtime-Environment
Features
- Added ingestion callback data column to the streams table
- Grant CREATE TYPE to db_executor to improve clean performance
Packages
For this release, use the exact versions listed below for the following packages.
Connectors
| Name | Version |
| --- | --- |
| CluedIn.Connector.AzureDataLake | 4.3.0 |
| CluedIn.Connector.AzureDedicatedSqlPool | 4.0.0 |
| CluedIn.Connector.AzureEventHub | 4.0.0 |
| CluedIn.Connector.AzureServiceBus | 4.0.0 |
| CluedIn.Connector.Http | 4.0.0 |
| CluedIn.Connector.SqlServer | 4.1.0 |
| CluedIn.PowerApps | 4.3.0 |
| CluedIn.Connector.Dataverse | 4.3.0 |
| CluedIn.Connector.OneLake | 4.3.0 |
Enrichers
| Name | Version |
| --- | --- |
| CluedIn.ExternalSearch.Providers.DuckDuckGo.Provider | 4.0.0 |
| CluedIn.ExternalSearch.Providers.PermId.Provider | 4.0.0 |
| CluedIn.ExternalSearch.Providers.Web | 4.1.0 |
| CluedIn.Provider.ExternalSearch.Bregg | 4.0.0 |
| CluedIn.Provider.ExternalSearch.ClearBit | 4.1.0 |
| CluedIn.Provider.ExternalSearch.CompanyHouse | 4.0.0 |
| CluedIn.Provider.ExternalSearch.CVR | 4.1.0 |
| CluedIn.Provider.ExternalSearch.Gleif | 4.0.0 |
| CluedIn.Provider.ExternalSearch.GoogleMaps | 4.1.0 |
| CluedIn.Provider.ExternalSearch.KnowledgeGraph | 4.0.0 |
| CluedIn.Provider.ExternalSearch.Libpostal | 4.1.0 |
| CluedIn.Provider.ExternalSearch.OpenCorporates | 4.0.0 |
| CluedIn.Provider.ExternalSearch.Providers.VatLayer | 4.0.0 |
Crawlers
| Name | Version |
| --- | --- |
| CluedIn.Crawling.MasterDataServices | 4.0.0 |
| CluedIn.Purview | 4.3.0 |
Other
| Name | Version |
| --- | --- |
| CluedIn.Vocabularies.CommonDataModel | 4.3.0 |
| CluedIn.EventHub | 4.3.0 |
Controller
| Docker Image | Tags |
| --- | --- |
| cluedin/controller | 2024.07.01, 2024.07, 4.3, 4.3.0, 4.3.0_91677 |
Gql
| Docker Image | Tags |
| --- | --- |
| cluedin/cluedin-ui-gql | 2024.07.01, 2024.07, 4.3, 4.3.1, 4.3.1_98587 |
Microservices
| Docker Image | Tags |
| --- | --- |
| cluedin/data-source | 2024.07.01, 2024.07, 4.3, 4.3.1, 4.3.1_98697 |
| cluedin/data-source-processing | 2024.07.01, 2024.07, 4.3, 4.3.1, 4.3.1_98697 |
| cluedin/data-source-submitter | 2024.07.01, 2024.07, 4.3, 4.3.1, 4.3.1_98697 |
Runtime
| Docker Image | Tags |
| --- | --- |
| cluedin/neo4j | 2024.07.01, 2024.07, 4.3, 4.3.1, 4.3.1_98824 |
| cluedin/openrefine | 2024.07.01, 2024.07, 4.3, 4.3.1, 4.3.1_98824 |
Server
| Docker Image | Tags |
| --- | --- |
| cluedin/cluedin-server | 2024.07.01, 2024.07, 4.3, 4.3.1, 4.3.1_98590, 4.3.1_98590-alpine, 4.3.1-alpine, 4.3-alpine |
| cluedin/cluedin-server | 2024.07.01, 2024.07, 4.3.1_98590-ubuntu, 4.3.1-ubuntu, 4.3-ubuntu |
| cluedin/nuget-installer | 2024.07.01, 2024.07, 4.3, 4.3.1, 4.3.1_98590, 4.3.1_98590-alpine, 4.3.1-alpine, 4.3-alpine |
| cluedin/nuget-installer | 2024.07.01, 2024.07, 4.3.1_98590-ubuntu, 4.3.1-ubuntu, 4.3-ubuntu |
Ui
| Docker Image | Tags |
| --- | --- |
| cluedin/ui | 2024.07.01, 2024.07, 4.3, 4.3.1, 4.3.1_98586 |