Estoy escribiendo una consulta que se utilizará para buscar resultados en una fuente social. El concepto es que la aplicación móvil solicitará N elementos y proporcionará una fecha y hora de inicio que he llamado a @CutoffTime
continuación. El propósito del tiempo de corte es establecer cuándo debe comenzar la ventana de paginación. La razón por la que estamos usando una marca de tiempo en lugar de un desplazamiento de fila es la marca de tiempo que nos permitirá pasar de una página coherente a la hora de obtener publicaciones antiguas, incluso si se agrega contenido social más nuevo.
Dado que los elementos del feed social pueden ser de usted o de sus amigos, estoy usando un UNION
para combinar los resultados de esos dos grupos. Originalmente probé la TheQuery_CTE
lógica sin el UNION
y fue muy lento.
Esto es lo que he hecho (incluido el esquema de tabla pertinente):
CREATE TABLE [Content].[Photo]
(
[PhotoId] INT NOT NULL PRIMARY KEY IDENTITY (1, 1),
[Key] UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID(),
[FullResolutionUrl] NVARCHAR(255) NOT NULL,
[Description] NVARCHAR(255) NULL,
[Created] DATETIME2(2) NOT NULL DEFAULT SYSUTCDATETIME(),
);
CREATE TABLE [Content].[UserPhotoAssociation]
(
[PhotoId] INT NOT NULL,
[UserId] INT NOT NULL,
[ShowInSocialFeed] BIT NOT NULL DEFAULT 0,
CONSTRAINT [PK_UserPhotos] PRIMARY KEY ([PhotoId], [UserId]),
CONSTRAINT [FK_UserPhotos_User] FOREIGN KEY ([UserId])
REFERENCES [User].[User]([UserId]),
CONSTRAINT [FK_UserPhotos_Photo] FOREIGN KEY ([PhotoId])
REFERENCES [Content].[Photo]([PhotoId])
);
CREATE TABLE [Content].[FlaggedPhoto]
(
[FlaggedPhotoId] INT NOT NULL PRIMARY KEY IDENTITY(1,1),
[PhotoId] INT NOT NULL,
[FlaggedBy] INT NOT NULL,
[FlaggedOn] DATETIME2(0) NOT NULL DEFAULT SYSDATETIME(),
[FlaggedStatus] INT NOT NULL DEFAULT 1,
[ReviewedBy] INT NULL,
[ReviewedAt] DATETIME2(0) NULL
CONSTRAINT [FK_Photos_PhotoId_to_FlaggedPhotos_PhotoId] FOREIGN KEY ([PhotoId])
REFERENCES [Content].[Photo]([PhotoId]),
CONSTRAINT [FK_FlaggedPhotoStatus_FlaggedPhotoStatusId_to_FlaggedPhotos_FlaggedStatus] FOREIGN KEY ([FlaggedStatus])
REFERENCES [Content].[FlaggedContentStatus]([FlaggedContentStatusId]),
CONSTRAINT [FK_User_UserId_to_FlaggedPhotos_FlaggedBy] FOREIGN KEY ([FlaggedBy])
REFERENCES [User].[User]([UserId]),
CONSTRAINT [FK_User_UserId_to_FlaggedPhotos_ReviewedBy] FOREIGN KEY ([ReviewedBy])
REFERENCES [User].[User]([UserId])
);
CREATE TABLE [User].[CurrentConnections]
(
[MonitoringId] INT NOT NULL PRIMARY KEY IDENTITY,
[Monitor] INT NOT NULL,
[Monitored] INT NOT NULL,
[ShowInSocialFeed] BIT NOT NULL DEFAULT 1,
CONSTRAINT [FK_Monitoring_Monitor_to_User_UserId] FOREIGN KEY ([Monitor])
REFERENCES [dbo].[User]([UserId]),
CONSTRAINT [FK_Monitoring_Monitored_to_User_UserId] FOREIGN KEY ([Monitored])
REFERENCES [dbo].[User]([UserId])
);
CREATE TABLE [Content].[PhotoLike]
(
[PhotoLikeId] INT NOT NULL PRIMARY KEY IDENTITY,
[PhotoId] INT NOT NULL,
[UserId] INT NOT NULL,
[Created] DATETIME2(2) NOT NULL DEFAULT SYSUTCDATETIME(),
[Archived] DATETIME2(2) NULL,
CONSTRAINT [FK_PhotoLike_PhotoId_to_Photo_PhotoId] FOREIGN KEY ([PhotoId])
REFERENCES [Content].[Photo]([PhotoId]),
CONSTRAINT [FK_PhotoLike_UserId_to_User_UserId] FOREIGN KEY ([UserId])
REFERENCES [User].[User]([UserId])
);
CREATE TABLE [Content].[Comment]
(
[CommentId] INT NOT NULL PRIMARY KEY IDENTITY,
[PhotoId] INT NOT NULL,
[UserId] INT NOT NULL,
[Comment] NVARCHAR(255) NOT NULL,
[Created] DATETIME2(2) NOT NULL DEFAULT SYSUTCDATETIME(),
[CommentOrder] DATETIME2(2) NOT NULL DEFAULT SYSUTCDATETIME(),
[Archived] DATETIME2(2) NULL,
CONSTRAINT [FK_Comment_PhotoId_to_Photo_PhotoId] FOREIGN KEY ([PhotoId])
REFERENCES [Content].[Photo]([PhotoId]),
CONSTRAINT [FK_Comment_UserId_to_User_UserId] FOREIGN KEY ([UserId])
REFERENCES [User].[User]([UserId])
);
/*
End table schema
*/
DECLARE @UserId INT,
@NumberOfItems INT,
@CutoffTime DATETIME2(2) = NULL -- Stored Proc input params
-- Make the joins and grab the social data we need once since they are used in subsequent queries that aren't shown
DECLARE @SocialFeed TABLE ([Key] UNIQUEIDENTIFIER, [PhotoId] INT
, [Description] NVARCHAR(255), [FullResolutionUrl] NVARCHAR(255)
, [Created] DATETIME2(2), [CreatorId] INT, [LikeCount] INT
, [CommentCount] INT, [UserLiked] BIT);
-- Offset might be different for each group
DECLARE @OffsetMine INT = 0, @OffsetTheirs INT = 0;
IF @CutoffTime IS NOT NULL
BEGIN
-- Get the offsets
;WITH [GetCounts_CTE] AS
(
SELECT
[P].[PhotoId] -- INT
, 1 AS [MyPhotos]
FROM [Content].[Photo] [P]
INNER JOIN [Content].[UserPhotoAssociation] [UPA] ON
[UPA].[PhotoId] = [P].[PhotoId]
AND
[UPA].[ShowInSocialFeed] = 1
LEFT JOIN [Content].[FlaggedPhoto] [FP] ON
[FP].[PhotoId] = [P].[PhotoId]
AND
[FP].[FlaggedStatus] = 3 -- Flagged photos that are confirmed apply to everyone
WHERE
[FP].[FlaggedPhotoId] IS NULL -- Filter out flagged photos
AND
[UPA].[UserId] = @UserId -- Show the requesting user
AND
[P].[Created] >= @CutoffTime -- Get the newer items
UNION
SELECT
[P].[PhotoId] -- INT
, 0 AS [MyPhotos]
FROM [Content].[Photo] [P]
INNER JOIN [Content].[UserPhotoAssociation] [UPA] ON
[UPA].[PhotoId] = [P].[PhotoId]
AND
[UPA].[ShowInSocialFeed] = 1
INNER JOIN [User].[CurrentConnections] [M] ON
[M].[Monitored] = [UPA].[UserId]
AND
[M].[Monitor] = @UserId AND [M].[ShowInSocialFeed] = 1 -- this join isn't present above
LEFT JOIN [Content].[FlaggedPhoto] [FP] ON
[FP].[PhotoId] = [P].[PhotoId]
AND
(
[FP].[FlaggedStatus] = 3
OR
([FP].[FlaggedBy] = @UserId AND [FP].[FlaggedStatus] = 1)
) -- Flagged photos that are confirmed apply to everyone, pending flags apply to the user
WHERE
[FP].[FlaggedPhotoId] IS NULL -- Filter out flagged photos
AND
[P].[Created] >= @CutoffTime -- Get the newer items
)
SELECT
@OffsetMine = SUM(CASE WHEN [MyPhotos] = 1 THEN 1 ELSE 0 END)
, @OffsetTheirs = SUM(CASE WHEN [MyPhotos] = 0 THEN 1 ELSE 0 END)
FROM [GetCounts_CTE]
END
-- Prevent absence of social data from throwing an error below.
SET @OffsetMine = ISNULL(@OffsetMine, 0);
SET @OffsetTheirs = ISNULL(@OffsetTheirs, 0);
-- Actually select the data I want
;WITH TheQuery_CTE AS
(
SELECT
[P].[Key]
, [P].[PhotoId]
, [P].[Description]
, [P].[FullResolutionUrl]
, [P].[Created]
, [UPA].[UserId]
, COUNT(DISTINCT [PL].[PhotoLikeId]) AS [LikeCount] -- Count distinct used due to common join key
, COUNT(DISTINCT [C].[CommentId]) AS [CommentCount]
, CAST(ISNULL(MAX(CASE WHEN [PL].[UserId] = @UserId THEN 1 END), 0) AS BIT) AS [UserLiked]
FROM [Content].[Photo] [P]
INNER JOIN [Content].[UserPhotoAssociation] [UPA] ON
[UPA].[PhotoId] = [P].[PhotoId]
AND
[UPA].[ShowInSocialFeed] = 1
LEFT JOIN [Content].[PhotoLike] [PL] ON
[PL].[PhotoId] = [P].[PhotoId]
AND
[PL].[Archived] IS NULL
LEFT JOIN [Content].[Comment] [C] ON
[C].[PhotoId] = [P].[PhotoId]
AND
[C].[Archived] IS NULL
LEFT JOIN [Content].[FlaggedPhoto] [FP] ON
[FP].[PhotoId] = [P].[PhotoId]
AND
[FP].[FlaggedStatus] = 3 -- Flagged photos that are confirmed apply to everyone
WHERE
[FP].[FlaggedPhotoId] IS NULL -- Filter out flagged photos
AND
[UPA].[UserId] = @UserId -- Show the requesting user
GROUP BY
[P].[Key]
, [P].[PhotoId]
, [P].[Description]
, [P].[FullResolutionUrl]
, [P].[Created]
, [UPA].[UserId]
ORDER BY
[P].[Created] DESC
, [P].[Key] -- Ensure consistent order in case of duplicate timestamps
OFFSET @OffsetMine ROWS FETCH NEXT @NumberOfItems ROWS ONLY
UNION
SELECT
[P].[Key]
, [P].[PhotoId]
, [P].[Description]
, [P].[FullResolutionUrl]
, [P].[Created]
, [UPA].[UserId]
, COUNT(DISTINCT [PL].[PhotoLikeId]) AS [LikeCount]
, COUNT(DISTINCT [C].[CommentId]) AS [CommentCount]
, CAST(ISNULL(MAX(CASE WHEN [PL].[UserId] = @UserId THEN 1 END), 0) AS BIT) AS [UserLiked]
FROM [Content].[Photo] [P]
INNER JOIN [Content].[UserPhotoAssociation] [UPA] ON
[UPA].[PhotoId] = [P].[PhotoId]
AND
[UPA].[ShowInSocialFeed] = 1
INNER JOIN [User].[CurrentConnections] [M] ON
[M].[Monitored] = [UPA].[UserId]
AND
[M].[Monitor] = @UserId AND [M].[ShowInSocialFeed] = 1
LEFT JOIN [Content].[PhotoLike] [PL] ON
[PL].[PhotoId] = [P].[PhotoId]
AND
[PL].[Archived] IS NULL
LEFT JOIN [Content].[Comment] [C] ON
[C].[PhotoId] = [P].[PhotoId]
AND
[C].[Archived] IS NULL
LEFT JOIN [Content].[FlaggedPhoto] [FP] ON
[FP].[PhotoId] = [P].[PhotoId]
AND
(
[FP].[FlaggedStatus] = 3
OR
([FP].[FlaggedBy] = @UserId AND [FP].[FlaggedStatus] = 1)
) -- Flagged photos that are confirmed apply to everyone, pending flags apply to the user
WHERE
[FP].[FlaggedPhotoId] IS NULL -- Filter out flagged photos
GROUP BY
[P].[Key]
, [P].[PhotoId]
, [P].[Description]
, [P].[FullResolutionUrl]
, [P].[Created]
, [UPA].[UserId]
ORDER BY
[P].[Created] DESC
, [P].[Key] -- Ensure consistant order in case of duplicate timestamps
OFFSET @OffsetTheirs ROWS FETCH NEXT @NumberOfItems ROWS ONLY
)
INSERT INTO @SocialFeed ([Key], [PhotoId], [Description], [FullResolutionUrl]
, [Created], [CreatorId], [LikeCount], [CommentCount], [UserLiked])
SELECT TOP (@NumberOfItems)
[Key]
, [PhotoId]
, [Description]
, [FullResolutionUrl]
, [Created]
, [UserId]
, [LikeCount]
, [CommentCount]
, [UserLiked]
FROM [TheQuery_CTE]
ORDER BY -- Order here so the top works properly
[Created] DESC
, [Key] -- Ensure consistent order in case of duplicate timestamps
-- Output the social feed
SELECT
[P].[Key]
, [P].[PhotoId]
, [P].[Description] AS [PhotoDescription]
, [P].[FullResolutionUrl]
, [P].[Created] AS [Posted]
, [P].[CreatorId]
, [LikeCount]
, [CommentCount]
, [UserLiked]
FROM @Photos [P]
-- Select other data needed to build the object tree in the application layer
Me doy cuenta de que puedo deshacerme de él UNION
en el GetCounts_CTE
pero no creo que realmente resuelva ninguno de los problemas que veo a continuación.
Veo algunos problemas potenciales:
- Esa es mucha lógica duplicada, así que probablemente me esté haciendo la vida más difícil.
- Si ocurre una inserción entre el cálculo del recuento y la selección de los datos, voy a estar apagado. No creo que esto suceda con frecuencia, pero conduciría a errores raros / difíciles de depurar.
- Todos los problemas más inteligentes / más experiencia que las personas encontrarían con la configuración anterior.
¿Cuál es la mejor manera de escribir esta consulta? Puntos de bonificación: la solución me simplifica la vida.
Editar:
No quiero seleccionar todos los datos y dejar que el cliente muestre los elementos perezosamente, porque no quiero abusar de los planes de datos de las personas obligándolos a descargar elementos que nunca verán. Es cierto que los datos probablemente no serán tan grandes en el gran esquema de las cosas, pero la acumulación de centavos ...
Edición 2:
Sospecho firmemente que esta no es la solución óptima, pero es la mejor que he encontrado hasta ahora.
Mover mi UNION
consulta a una VIEW
como Greg sugirió que funcionó bien para ocultar esa lógica y dar una consulta más concisa en mi procedimiento almacenado. La vista también abstrae la fealdad / complicación de la unión, lo cual es bueno porque lo estoy usando dos veces en mi selección. Aquí está el código para la vista:
CREATE VIEW [Social].[EverFeed]
AS
SELECT
[P].[Key]
, [P].[PhotoId]
, [P].[Description]
, [P].[FullResolutionUrl]
, [P].[Created]
, [UPA].[UserId]
, COUNT(DISTINCT [PL].[PhotoLikeId]) AS [LikeCount] -- Distinct due to common join key
, COUNT(DISTINCT [C].[CommentId]) AS [CommentCount]
, CAST(ISNULL(
MAX(CASE WHEN [PL].[UserId] = [UPA].[UserId] THEN 1 END), 0) AS BIT) AS [UserLiked]
, NULL AS [Monitor]
FROM [Content].[Photo] [P]
INNER JOIN [Content].[UserPhotoAssociation] [UPA] ON
[UPA].[PhotoId] = [P].[PhotoId]
AND
[UPA].[ShowInSocialFeed] = 1
LEFT JOIN [Content].[PhotoLike] [PL] ON
[PL].[PhotoId] = [P].[PhotoId]
AND
[PL].[Archived] IS NULL
LEFT JOIN [Content].[Comment] [C] ON
[C].[PhotoId] = [P].[PhotoId]
AND
[C].[Archived] IS NULL
LEFT JOIN [Content].[FlaggedPhoto] [FP] ON
[FP].[PhotoId] = [P].[PhotoId]
AND
[FP].[FlaggedStatus] = 3 -- Flagged photos that are confirmed apply to everyone
WHERE
[FP].[FlaggedPhotoId] IS NULL -- Filter out flagged photos
GROUP BY
[P].[Key]
, [P].[PhotoId]
, [P].[Description]
, [P].[FullResolutionUrl]
, [P].[Created]
, [UPA].[UserId]
UNION
SELECT
[P].[Key]
, [P].[PhotoId]
, [P].[Description]
, [P].[FullResolutionUrl]
, [P].[Created]
, [UPA].[UserId]
, COUNT(DISTINCT [PL].[PhotoLikeId]) AS [LikeCount]
, COUNT(DISTINCT [C].[CommentId]) AS [CommentCount]
, CAST(ISNULL(
MAX(CASE WHEN [PL].[UserId] = [M].[Monitor] THEN 1 END), 0) AS BIT) AS [UserLiked]
, [M].[Monitor]
FROM [Content].[Photo] [P]
INNER JOIN [Content].[UserPhotoAssociation] [UPA] ON
[UPA].[PhotoId] = [P].[PhotoId]
AND
[UPA].[ShowInSocialFeed] = 1
INNER JOIN [User].[CurrentConnections] [M] ON
[M].[Monitored] = [UPA].[UserId]
AND
[M].[ShowInSocialFeed] = 1
LEFT JOIN [Content].[PhotoLike] [PL] ON
[PL].[PhotoId] = [P].[PhotoId]
AND
[PL].[Archived] IS NULL
LEFT JOIN [Content].[Comment] [C] ON
[C].[PhotoId] = [P].[PhotoId]
AND
[C].[Archived] IS NULL
LEFT JOIN [Content].[FlaggedPhoto] [FP] ON
[FP].[PhotoId] = [P].[PhotoId]
AND
(
[FP].[FlaggedStatus] = 3
OR
([FP].[FlaggedBy] = [M].[Monitor] AND [FP].[FlaggedStatus] = 1)
) -- Flagged photos that are confirmed (3) apply to everyone
-- , pending flags (1) apply to the user
WHERE
[FP].[FlaggedPhotoId] IS NULL -- Filter out flagged photos
GROUP BY
[P].[Key]
, [P].[PhotoId]
, [P].[Description]
, [P].[FullResolutionUrl]
, [P].[Created]
, [UPA].[UserId]
, [M].[Monitor]
Usando esa vista, acorté mi consulta a lo siguiente. Tenga en cuenta que estoy configurando OFFSET
con una subconsulta.
DECLARE @UserId INT, @NumberOfItems INT, @CutoffTime DATETIME2(2);
SELECT
[Key]
, [PhotoId]
, [Description]
, [FullResolutionUrl]
, [Created]
, [UserId]
, [LikeCount]
, [CommentCount]
, [UserLiked]
FROM [Social].[EverFeed] [EF]
WHERE
(
([EF].[UserId] = @UserId AND [EF].[Monitor] IS NULL)
OR
[EF].[Monitor] = @UserId
)
ORDER BY -- Order here so the top works properly
[Created] DESC
, [Key] -- Ensure consistant order in case of duplicate timestamps
OFFSET CASE WHEN @CutoffTime IS NULL THEN 0 ELSE
(
SELECT
COUNT([PhotoId])
FROM [Social].[EverFeed] [EF]
WHERE
(
([EF].[UserId] = @UserId AND [EF].[Monitor] IS NULL)
OR
[EF].[Monitor] = @UserId
)
AND
[EF].[Created] >= @CutoffTime -- Get the newer items
) END
ROWS FETCH NEXT @NumberOfItems ROWS ONLY
La vista separa muy bien la complejidad del UNION
filtrado. Creo que la subconsulta en la OFFSET
cláusula evitará problemas de concurrencia que me preocupaban al hacer que toda la consulta sea atómica.
Un problema que acabo de encontrar al escribir esto es: en el código anterior, si dos fotos con la misma fecha de creación están en "páginas" diferentes, las fotos en las páginas siguientes se filtrarán. Considere los siguientes datos:
PhotoId | Created | ...
------------------------
1 | 2015-08-26 01:00.00
2 | 2015-08-26 01:00.00
3 | 2015-08-26 01:00.00
Con un tamaño de página de 1 en la página inicial, PhotoId 1
se devolverá. Con el mismo tamaño de página en la segunda página, no se devolverán resultados. Creo que para resolver esto voy a tener que agregar el Key
Guid como parámetro ...